POLYPEPTIDES HAVING CELLOBIOHYDROLASE I ACTIVITY
AND POLYNUCLEOTIDES ENCODING SAME



Field of the Invention 

The present invention relates to polypeptides having cellobiohydrolase I (also referred to
as CBH I or CBH 1) activity and polynucleotides having a nucleotide sequence which encodes
for the polypeptides. The invention also relates to nucleic acid constructs, vectors, and host
cells comprising the nucleic acid constructs as well as methods for producing and using the
polypeptides.

Background of the Invention 

Cellulose is an important industrial raw material and a source of renewable energy. The
physical structure and morphology of native cellulose are complex and the fine details of its
structure have been difficult to determine experimentally. However, the chemical composition
of cellulose is simple, consisting of D-glucose residues linked by beta-1,4-glycosidic bonds to
form linear polymers with chain lengths of over 10,000 glycosidic residues.

In order to be efficient, the digestion of cellulose requires several types of enzymes 
acting cooperatively. At least three categories of enzymes are necessary to convert cellulose 
into glucose: endo-(1,4)-beta-D-glucanases (EC 3.2.1.4) that cut the cellulose chains at
random; cellobiohydrolases (EC 3.2.1.91) which cleave cellobiosyl units from the cellulose
chain ends; and beta-glucosidases (EC 3.2.1.21) that convert cellobiose and soluble
cellodextrins into glucose. Among these three categories of enzymes involved in the 
biodegradation of cellulose, cellobiohydrolases are the key enzymes for the degradation of 
native crystalline cellulose. 

Exo-cellobiohydrolases (Cellobiohydrolase I, or CBH I) refer to the cellobiohydrolases
which degrade cellulose by hydrolyzing cellobiose from the non-reducing end of the
cellulose polymer chains.

It is an object of the present invention to provide improved polypeptides having 
cellobiohydrolase I activity and polynucleotides encoding the polypeptides. The improved
polypeptides may have improved specific activity and/or improved stability - in particular
improved thermostability. The polypeptides may also have an improved ability to resist 
inhibition by cellobiose. 

Summary of the Invention 

In a first aspect the present invention relates to a polypeptide having cellobiohydrolase I
activity, selected from the group consisting of:

(a) a polypeptide comprising an amino acid sequence selected from the group consisting of:
an amino acid sequence which has at least 80% identity with amino acids 1 to 526 of
SEQ ID NO:2,

an amino acid sequence which has at least 80% identity with amino acids 1 to 529 of
SEQ ID NO:4,

an amino acid sequence which has at least 80% identity with amino acids 1 to 451 of
SEQ ID NO:6,

an amino acid sequence which has at least 80% identity with amino acids 1 to 457 of
SEQ ID NO:8,

an amino acid sequence which has at least 80% identity with amino acids 1 to 538 of
SEQ ID NO:10,

an amino acid sequence which has at least 70% identity with amino acids 1 to 415 of
SEQ ID NO:12,

an amino acid sequence which has at least 70% identity with amino acids 1 to 447 of
SEQ ID NO:14,

an amino acid sequence which has at least 80% identity with amino acids 1 to 452 of
SEQ ID NO:16,

an amino acid sequence which has at least 80% identity with amino acids 1 to 454 of
SEQ ID NO:38,

an amino acid sequence which has at least 80% identity with amino acids 1 to 458 of
SEQ ID NO:40,

an amino acid sequence which has at least 80% identity with amino acids 1 to 450 of
SEQ ID NO:42,

an amino acid sequence which has at least 80% identity with amino acids 1 to 446 of
SEQ ID NO:44,

an amino acid sequence which has at least 80% identity with amino acids 1 to 527 of
SEQ ID NO:46,

an amino acid sequence which has at least 80% identity with amino acids 1 to 455 of
SEQ ID NO:48,

an amino acid sequence which has at least 80% identity with amino acids 1 to 464 of
SEQ ID NO:50,

an amino acid sequence which has at least 80% identity with amino acids 1 to 460 of
SEQ ID NO:52,

an amino acid sequence which has at least 80% identity with amino acids 1 to 450 of
SEQ ID NO:54,

an amino acid sequence which has at least 80% identity with amino acids 1 to 532 of
SEQ ID NO:56,

an amino acid sequence which has at least 80% identity with amino acids 1 to 460 of
SEQ ID NO:58,

an amino acid sequence which has at least 80% identity with amino acids 1 to 525 of
SEQ ID NO:60, and

an amino acid sequence which has at least 80% identity with amino acids 1 to 456 of
SEQ ID NO:66;



(b) a polypeptide comprising an amino acid sequence selected from the group consisting of: 
an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Acremonium
thermophilum,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Chaetomium 
thermophilum, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
the cellobiohydrolase I encoding part of the nucleotide sequence present in Scytalidium
sp.,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Scytalidium 
thermophilum,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Thermoascus aurantiacus, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Thielavia
australiensis,

an amino acid sequence which has at least 70% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Verticillium 
tenerum, 

an amino acid sequence which has at least 70% identity with the polypeptide encoded by
the cellobiohydrolase I encoding part of the nucleotide sequence present in Neotermes
castaneus,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Melanocarpus albomyces,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
the cellobiohydrolase I encoding part of the nucleotide sequence present in
Acremonium sp.,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Chaetomidium pingtungium,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
the cellobiohydrolase I encoding part of the nucleotide sequence present in
Sporotrichum pruinosum,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Diplodia 
gossypina,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Trichophaea 
saccata, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in
Myceliophthora thermophila,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Exidia 
glandulosa, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
the cellobiohydrolase I encoding part of the nucleotide sequence present in Xylaria
hypoxylon,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Poitrasia 
circinans,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Coprinus 
cinereus, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in
Pseudoplectania nigrella,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Trichothecium roseum IFO 5372, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Humicola nigrescens CBS 819.73,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Cladorrhinum foecundissimum CBS 427.97, 
an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Diplodia gossypina CBS 247.96,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Myceliophthora thermophila CBS 117.65,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Rhizomucor pusillus CBS 109471,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Meripilus giganteus CBS 521.95,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Exidia glandulosa CBS 2377.96, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Xylaria hypoxylon CBS 284.96, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Trichophaea saccata CBS 804.70, 
an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Chaetomium sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Myceliophthora hinnulea, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Thielavia cf. microspora, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Aspergillus sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Scopulariopsis sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Fusarium sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Verticillium sp., and 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Phytophthora infestans;

(c) a polypeptide comprising an amino acid sequence selected from the group consisting of: 
an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
nucleotides 1 to 1578 of SEQ ID NO:1, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
nucleotides 1 to 1587 of SEQ ID NO:3, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1353 of SEQ ID NO:5,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1371 of SEQ ID NO:7,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1614 of SEQ ID NO:9,

an amino acid sequence which has at least 70% identity with the polypeptide encoded by
nucleotides 1 to 1245 of SEQ ID NO:11,

an amino acid sequence which has at least 70% identity with the polypeptide encoded by
nucleotides 1 to 1341 of SEQ ID NO:13,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1356 of SEQ ID NO:15,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1365 of SEQ ID NO:37,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1377 of SEQ ID NO:39,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1353 of SEQ ID NO:41,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1341 of SEQ ID NO:43,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1584 of SEQ ID NO:45,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1368 of SEQ ID NO:47,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1395 of SEQ ID NO:49,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1383 of SEQ ID NO:51,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1353 of SEQ ID NO:53,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1599 of SEQ ID NO:55,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1383 of SEQ ID NO:57,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1578 of SEQ ID NO:59, and

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1371 of SEQ ID NO:65;

(d) a polypeptide which is encoded by a nucleotide sequence which hybridizes under high
stringency conditions with a polynucleotide probe selected from the group consisting of:
(i) the complementary strand of the nucleotides selected from the group consisting of:
nucleotides 1 to 1578 of SEQ ID NO:1,
nucleotides 1 to 1587 of SEQ ID NO:3,
nucleotides 1 to 1353 of SEQ ID NO:5,
nucleotides 1 to 1371 of SEQ ID NO:7,
nucleotides 1 to 1614 of SEQ ID NO:9,
nucleotides 1 to 1245 of SEQ ID NO:11,
nucleotides 1 to 1341 of SEQ ID NO:13,
nucleotides 1 to 1356 of SEQ ID NO:15,
nucleotides 1 to 1365 of SEQ ID NO:37,
nucleotides 1 to 1377 of SEQ ID NO:39,
nucleotides 1 to 1353 of SEQ ID NO:41,
nucleotides 1 to 1341 of SEQ ID NO:43,
nucleotides 1 to 1584 of SEQ ID NO:45,
nucleotides 1 to 1368 of SEQ ID NO:47,
nucleotides 1 to 1395 of SEQ ID NO:49,
nucleotides 1 to 1383 of SEQ ID NO:51,
nucleotides 1 to 1353 of SEQ ID NO:53,
nucleotides 1 to 1599 of SEQ ID NO:55,
nucleotides 1 to 1383 of SEQ ID NO:57,
nucleotides 1 to 1578 of SEQ ID NO:59, and
nucleotides 1 to 1371 of SEQ ID NO:65;

(ii) the complementary strand of the nucleotides selected from the group consisting of:
nucleotides 1 to 500 of SEQ ID NO:1,
nucleotides 1 to 500 of SEQ ID NO:3,
nucleotides 1 to 500 of SEQ ID NO:5,
nucleotides 1 to 500 of SEQ ID NO:7,
nucleotides 1 to 500 of SEQ ID NO:9,
nucleotides 1 to 500 of SEQ ID NO:11,
nucleotides 1 to 500 of SEQ ID NO:13,
nucleotides 1 to 500 of SEQ ID NO:15,
nucleotides 1 to 500 of SEQ ID NO:37,
nucleotides 1 to 500 of SEQ ID NO:39,
nucleotides 1 to 500 of SEQ ID NO:41,
nucleotides 1 to 500 of SEQ ID NO:43,
nucleotides 1 to 500 of SEQ ID NO:45,
nucleotides 1 to 500 of SEQ ID NO:47,
nucleotides 1 to 500 of SEQ ID NO:49,
nucleotides 1 to 500 of SEQ ID NO:51,
nucleotides 1 to 500 of SEQ ID NO:53,
nucleotides 1 to 500 of SEQ ID NO:55,
nucleotides 1 to 500 of SEQ ID NO:57,
nucleotides 1 to 500 of SEQ ID NO:59,
nucleotides 1 to 500 of SEQ ID NO:65,
nucleotides 1 to 221 of SEQ ID NO:17,
nucleotides 1 to 239 of SEQ ID NO:18,
nucleotides 1 to 199 of SEQ ID NO:19,
nucleotides 1 to 191 of SEQ ID NO:20,
nucleotides 1 to 232 of SEQ ID NO:21,
nucleotides 1 to 467 of SEQ ID NO:22,
nucleotides 1 to 534 of SEQ ID NO:23,
nucleotides 1 to 563 of SEQ ID NO:24,
nucleotides 1 to 218 of SEQ ID NO:25,
nucleotides 1 to 492 of SEQ ID NO:26,
nucleotides 1 to 481 of SEQ ID NO:27,
nucleotides 1 to 463 of SEQ ID NO:28,
nucleotides 1 to 513 of SEQ ID NO:29,
nucleotides 1 to 579 of SEQ ID NO:30,
nucleotides 1 to 514 of SEQ ID NO:31,
nucleotides 1 to 477 of SEQ ID NO:32,
nucleotides 1 to 500 of SEQ ID NO:33,
nucleotides 1 to 470 of SEQ ID NO:34,
nucleotides 1 to 491 of SEQ ID NO:35,
nucleotides 1 to 221 of SEQ ID NO:36,
nucleotides 1 to 519 of SEQ ID NO:61,
nucleotides 1 to 497 of SEQ ID NO:62,
nucleotides 1 to 498 of SEQ ID NO:63,
nucleotides 1 to 525 of SEQ ID NO:64, and
nucleotides 1 to 951 of SEQ ID NO:67; and

(iii) the complementary strand of the nucleotides selected from the group consisting of:
nucleotides 1 to 200 of SEQ ID NO:1,
nucleotides 1 to 200 of SEQ ID NO:3,
nucleotides 1 to 200 of SEQ ID NO:5,
nucleotides 1 to 200 of SEQ ID NO:7,
nucleotides 1 to 200 of SEQ ID NO:9,
nucleotides 1 to 200 of SEQ ID NO:11,
nucleotides 1 to 200 of SEQ ID NO:13,
nucleotides 1 to 200 of SEQ ID NO:15,
nucleotides 1 to 200 of SEQ ID NO:37,
nucleotides 1 to 200 of SEQ ID NO:39,
nucleotides 1 to 200 of SEQ ID NO:41,
nucleotides 1 to 200 of SEQ ID NO:43,
nucleotides 1 to 200 of SEQ ID NO:45,
nucleotides 1 to 200 of SEQ ID NO:47,
nucleotides 1 to 200 of SEQ ID NO:49,
nucleotides 1 to 200 of SEQ ID NO:51,
nucleotides 1 to 200 of SEQ ID NO:53,
nucleotides 1 to 200 of SEQ ID NO:55,
nucleotides 1 to 200 of SEQ ID NO:57,
nucleotides 1 to 200 of SEQ ID NO:59, and
nucleotides 1 to 200 of SEQ ID NO:65; and

(e) a fragment of (a), (b) or (c) that has cellobiohydrolase I activity. 
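
As a non-limiting illustration of the probe definitions in (d), the following Python sketch derives the complementary strand of a numbered nucleotide range. It assumes that "nucleotides 1 to N" is 1-based and inclusive and that the probe is written 5' to 3'. The helper name `probe_complement` and the toy sequence are hypothetical and are not taken from any SEQ ID NO.

```python
# Hypothetical helper: complementary strand of nucleotides start..end
# (1-based, inclusive) of a DNA sequence, written 5' -> 3'.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def probe_complement(sequence: str, start: int, end: int) -> str:
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must lie within the sequence")
    region = sequence[start - 1:end]  # convert 1-based inclusive to a Python slice
    # Complement each base, then reverse so the result reads 5' -> 3'.
    return region.translate(_COMPLEMENT)[::-1]


toy_sequence = "ATGCGTACCG"  # made-up sequence for illustration only
print(probe_complement(toy_sequence, 1, 4))
```

For a real coding sequence, a call such as `probe_complement(seq, 1, 500)` would correspond to a probe of the kind listed under (ii), under the numbering assumption stated above.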

In a second aspect the present invention relates to a polynucleotide having a nucleotide 
25 sequence which encodes for the polypeptide of the invention. 

In a third aspect the present invention relates to a nucleic acid construct comprising the 
nucleotide sequence, which encodes for the polypeptide of the invention, operably linked to 
one or more control sequences that direct the production of the polypeptide in a suitable host.

In a fourth aspect the present invention relates to a recombinant expression vector
comprising the nucleic acid construct of the invention.

In a fifth aspect the present invention relates to a recombinant host cell comprising the 
nucleic acid construct of the invention. 

In a sixth aspect the present invention relates to a method for producing a polypeptide of 
the invention, the method comprising: 
(a) cultivating a strain, which in its wild-type form is capable of producing the
polypeptide, to produce the polypeptide; and
(b) recovering the polypeptide.

In a seventh aspect the present invention relates to a method for producing a
polypeptide of the invention, the method comprising:

(a) cultivating a recombinant host cell of the invention under conditions conducive for
production of the polypeptide; and
(b) recovering the polypeptide.

In an eighth aspect the present invention relates to a method for in-situ production of a
polypeptide of the invention, the method comprising:

(a) cultivating a recombinant host cell of the invention under conditions conducive for
production of the polypeptide; and
(b) contacting the polypeptide with a desired substrate without prior recovery of the
polypeptide.

Other aspects of the present invention will be apparent from the description below and
from the appended claims.

Definitions 

Prior to discussing the present invention in further detail, the following terms and
conventions will first be defined:

Substantially pure polypeptide: In the present context, the term "substantially pure
polypeptide" means a polypeptide preparation which contains at the most 10% by weight of
other polypeptide material with which it is natively associated (lower percentages of other
polypeptide material are preferred, e.g. at the most 8% by weight, at the most 6% by weight,
at the most 5% by weight, at the most 4% by weight, at the most 3% by weight, at the most 2%
by weight, at the most 1% by weight, and at the most ½% by weight). Thus, it is preferred that
the substantially pure polypeptide is at least 92% pure, i.e. that the polypeptide constitutes at
least 92% by weight of the total polypeptide material present in the preparation, and higher
percentages are preferred such as at least 94% pure, at least 95% pure, at least 96% pure, at
least 97% pure, at least 98% pure, at least 99%, and at the most 99.5%
pure. The polypeptides disclosed herein are preferably in a substantially pure form. In
particular, it is preferred that the polypeptides disclosed herein are in "essentially pure form",
i.e. that the polypeptide preparation is essentially free of other polypeptide material with which
it is natively associated. This can be accomplished, for example, by preparing the polypeptide
by means of well-known recombinant methods. Herein, the term "substantially pure
polypeptide" is synonymous with the terms "isolated polypeptide" and "polypeptide in isolated
form".

Cellobiohydrolase I activity: The term "cellobiohydrolase I activity" is defined herein as a
cellulose 1,4-beta-cellobiosidase (also referred to as Exo-glucanase, Exo-cellobiohydrolase or
1,4-beta-cellobiohydrolase) activity, as defined in the enzyme class EC 3.2.1.91, which
catalyzes the hydrolysis of 1,4-beta-D-glucosidic linkages in cellulose and cellotetraose,
releasing cellobiose from the non-reducing ends of the chains.

For purposes of the present invention, cellobiohydrolase I activity may be determined
according to the procedure described in Example 2.

In an embodiment, cellobiohydrolase I activity may be determined according to the
procedure described in Deshpande MV et al., Methods in Enzymology, pp. 126-130 (1988):
"Selective Assay for Exo-1,4-Beta-Glucanases". According to this procedure, one unit of
cellobiohydrolase I activity (agluconic bond cleavage activity) is defined as 1.0 µmole of
p-nitrophenol produced per minute at 50°C, pH 5.0.

The polypeptides of the present invention should preferably have at least 20% of the
cellobiohydrolase I activity of a polypeptide consisting of an amino acid sequence selected
from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID
NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66.
In a particularly preferred embodiment, the polypeptides should have at least 40%, such as at
least 50%, preferably at least 60%, such as at least 70%, more preferably at least 80%, such
as at least 90%, most preferably at least 95%, such as about or at least 100% of the
cellobiohydrolase I activity of the polypeptide consisting of the amino acid sequence selected
from the group consisting of amino acids 1 to 526 of SEQ ID NO:2, amino acids 1 to 529 of
SEQ ID NO:4, amino acids 1 to 451 of SEQ ID NO:6, amino acids 1 to 457 of SEQ ID NO:8,
amino acids 1 to 538 of SEQ ID NO:10, amino acids 1 to 415 of SEQ ID NO:12, amino acids 1
to 447 of SEQ ID NO:14, amino acids 1 to 452 of SEQ ID NO:16, amino acids 1 to 454 of SEQ
ID NO:38, amino acids 1 to 458 of SEQ ID NO:40, amino acids 1 to 450 of SEQ ID NO:42,
amino acids 1 to 446 of SEQ ID NO:44, amino acids 1 to 527 of SEQ ID NO:46, amino acids 1
to 455 of SEQ ID NO:48, amino acids 1 to 464 of SEQ ID NO:50, amino acids 1 to 460 of SEQ
ID NO:52, amino acids 1 to 450 of SEQ ID NO:54, amino acids 1 to 532 of SEQ ID NO:56,
amino acids 1 to 460 of SEQ ID NO:58, amino acids 1 to 525 of SEQ ID NO:60, and amino
acids 1 to 456 of SEQ ID NO:66.
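
By way of a hedged illustration, the unit definition and the relative-activity thresholds above may be expressed as simple arithmetic. The sketch below assumes that activity is measured as micromoles of p-nitrophenol produced over a known incubation time under the stated assay conditions (50°C, pH 5.0). The function names and the readings are hypothetical and are not data from this document.

```python
def cbh1_units(pnp_micromoles: float, minutes: float) -> float:
    """Activity units: micromoles of p-nitrophenol produced per minute."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return pnp_micromoles / minutes


def relative_activity_percent(candidate_units: float, reference_units: float) -> float:
    """Candidate activity expressed as a percentage of the reference activity."""
    if reference_units <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * candidate_units / reference_units


# Made-up readings for illustration only (assumed assay: 50 degrees C, pH 5.0).
reference = cbh1_units(pnp_micromoles=6.0, minutes=30.0)
candidate = cbh1_units(pnp_micromoles=1.5, minutes=30.0)

# The text prefers at least 20% of the reference activity.
print(relative_activity_percent(candidate, reference) >= 20.0)
```

The higher preferred thresholds (40%, 50%, and so on) can be checked in the same way by changing the comparison value.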

Identity: In the present context, the homology between two amino acid sequences or 
between two nucleotide sequences is described by the parameter "identity". 

For purposes of the present invention, the degree of identity between two amino acid
sequences is determined by using the program FASTA included in version 2.0x of the FASTA
program package (see W. R. Pearson and D. J. Lipman (1988), "Improved Tools for Biological
Sequence Analysis", PNAS 85:2444-2448; and W. R. Pearson (1990) "Rapid and Sensitive
Sequence Comparison with FASTP and FASTA", Methods in Enzymology 183:63-98). The
scoring matrix used was BLOSUM50, gap penalty was -12, and gap extension penalty was -2.

The degree of identity between two nucleotide sequences is determined using the same 
algorithm and software package as described above. The scoring matrix used was the identity 
matrix, gap penalty was -16, and gap extension penalty was -4. 
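
For illustration only, the sketch below shows how a percent identity value can be read off a pair of sequences that have already been aligned. It does not perform the alignment and is not the FASTA program; the scoring matrix and gap penalties named above govern how FASTA produces the alignment. The choice of denominator (all aligned columns, including gapped ones) is an assumption made here and may differ from what FASTA reports. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column with gaps on both sides carries no information
        columns += 1
        if a == b:
            matches += 1  # here neither residue is a gap, so this is a true match
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


# Toy alignment, made up for illustration: 5 identical columns out of 7.
print(round(percent_identity("MKT-AYL", "MKSQAYL"), 1))
```

A value computed this way could then be compared against the 70% or 80% identity limits recited in the first aspect.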

Fragment: When used herein, a "fragment" of a sequence selected from the group
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ
ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66 is a polypeptide
having one or more amino acids deleted from the amino and/or carboxyl terminus of this
amino acid sequence. Preferably, a fragment is a polypeptide having the amino acid
sequence deleted corresponding to the "cellulose-binding domain" and/or the "linker domain"
of Trichoderma reesei cellobiohydrolase I as described in SWISS-PROT accession number
P00725. More preferably, a fragment comprises the amino acid sequence corresponding to
the "catalytic domain" of Trichoderma reesei cellobiohydrolase I as described in SWISS-PROT
accession number P00725. Most preferably, a fragment contains at least 434 amino acid
residues, e.g., the amino acid residues selected from the group consisting of amino acids 1 to
434 of SEQ ID NO:2, amino acids 1 to 434 of SEQ ID NO:4, amino acids 1 to 434 of SEQ ID
NO:6, amino acids 1 to 434 of SEQ ID NO:8, amino acids 1 to 434 of SEQ ID NO:10, amino
acids 1 to 434 of SEQ ID NO:14, amino acids 1 to 434 of SEQ ID NO:16, amino acids 1 to 434
of SEQ ID NO:38, amino acids 1 to 434 of SEQ ID NO:40, amino acids 1 to 434 of SEQ ID
NO:42, amino acids 1 to 434 of SEQ ID NO:44, amino acids 1 to 434 of SEQ ID NO:46, amino
acids 1 to 434 of SEQ ID NO:48, amino acids 1 to 434 of SEQ ID NO:50, amino acids 1 to 434
of SEQ ID NO:52, amino acids 1 to 434 of SEQ ID NO:54, amino acids 1 to 434 of SEQ ID
NO:56, amino acids 1 to 434 of SEQ ID NO:58, amino acids 1 to 434 of SEQ ID NO:60, and
amino acids 1 to 434 of SEQ ID NO:66. In particular, a fragment contains at least 215 amino
acid residues, e.g., the amino acid residues selected from the group consisting of amino acids
200 to 434 of SEQ ID NO:2, amino acids 200 to 434 of SEQ ID NO:4, amino acids 200 to 434
of SEQ ID NO:6, amino acids 200 to 434 of SEQ ID NO:8, amino acids 200 to 434 of SEQ ID
NO:10, amino acids 200 to 415 of SEQ ID NO:12, amino acids 200 to 434 of SEQ ID NO:14,
amino acids 200 to 434 of SEQ ID NO:16, amino acids 200 to 434 of SEQ ID NO:38, amino
acids 200 to 434 of SEQ ID NO:40, amino acids 200 to 434 of SEQ ID NO:42, amino acids
200 to 434 of SEQ ID NO:44, amino acids 200 to 434 of SEQ ID NO:46, amino acids 200 to
434 of SEQ ID NO:48, amino acids 200 to 434 of SEQ ID NO:50, amino acids 200 to 434 of
SEQ ID NO:52, amino acids 200 to 434 of SEQ ID NO:54, amino acids 200 to 434 of SEQ ID
NO:56, amino acids 200 to 434 of SEQ ID NO:58, amino acids 200 to 434 of SEQ ID NO:60,
and amino acids 200 to 434 of SEQ ID NO:66.
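
As a minimal sketch of the residue ranges recited above, and assuming that "amino acids M to N" is 1-based and inclusive, a fragment and its length may be obtained as follows. The function name `fragment` and the placeholder sequence are hypothetical.

```python
def fragment(sequence: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a polypeptide sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("range must lie within the sequence")
    return sequence[first - 1:last]


placeholder = "A" * 526  # stand-in of 526 residues, not a real sequence

# Under the inclusive convention, residues 200 to 434 span 434 - 200 + 1 residues.
print(len(fragment(placeholder, 200, 434)))
# Residues 200 to 415 (the range given for SEQ ID NO:12) span 415 - 200 + 1 residues.
print(len(fragment(placeholder, 200, 415)))
```

Under this convention both ranges satisfy the "at least 215 amino acid residues" condition stated in the text.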



WO 03/000941 PCT/DK02/00429 

Allelic variant: In the present context, the term "allelic variant" denotes any of two or
more alternative forms of a gene occupying the same chromosomal locus. Allelic variation
arises naturally through mutation, and may result in polymorphism within populations. Gene
mutations can be silent (no change in the encoded polypeptide) or may encode polypeptides
having altered amino acid sequences. An allelic variant of a polypeptide is a polypeptide
encoded by an allelic variant of a gene.

Substantially pure polynucleotide: The term "substantially pure polynucleotide" as used 
herein refers to a polynucleotide preparation, wherein the polynucleotide has been removed 
from its natural genetic milieu, and is thus free of other extraneous or unwanted coding 
sequences and is in a form suitable for use within genetically engineered protein production 
systems. Thus, a substantially pure polynucleotide contains at the most 10% by weight of 
other polynucleotide material with which it is natively associated (lower percentages of other 
polynucleotide material are preferred, e.g. at the most 8% by weight, at the most 6% by 
weight, at the most 5% by weight, at the most 4% by weight, at the most 3% by weight, at the 
most 2% by weight, at the most 1% by weight, and at the most 0.5% by weight). A substantially 
pure polynucleotide may, however, include naturally occurring 5' and 3' untranslated regions, 
such as promoters and terminators. It is preferred that the substantially pure polynucleotide is at 
least 92% pure, i.e. that the polynucleotide constitutes at least 92% by weight of the total 
polynucleotide material present in the preparation, and higher percentages are preferred such 
as at least 94% pure, at least 95% pure, at least 96% pure, at least 97% pure, at least 98% 
pure, at least 99% pure, and at least 99.5% pure. The polynucleotides 
disclosed herein are preferably in a substantially pure form. In particular, it is preferred that the 
polynucleotides disclosed herein are in "essentially pure form", i.e. that the polynucleotide 
preparation is essentially free of other polynucleotide material with which it is natively 
associated. Herein, the term "substantially pure polynucleotide" is synonymous with the terms 
"isolated polynucleotide" and "polynucleotide in isolated form". 
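By way of illustration only, the weight-percentage criterion above may be expressed as a short calculation. The following Python sketch is not part of the disclosed methods; the function names and the idea of supplying two measured weights are assumptions made for the example, and only the 10% default threshold is taken from the definition above.

```python
def polynucleotide_purity(target_weight, other_weight):
    """Return the weight percentage of the polynucleotide of interest.

    target_weight: weight of the polynucleotide of interest (any unit).
    other_weight: weight of other, natively associated polynucleotide
                  material in the same preparation (same unit).
    """
    total = target_weight + other_weight
    if total <= 0:
        raise ValueError("total weight must be positive")
    return 100.0 * target_weight / total


def is_substantially_pure(target_weight, other_weight, max_other_percent=10.0):
    """Check the 'at the most 10% by weight of other material' criterion.

    max_other_percent is a hypothetical parameter; lower values (8, 6, 5, ...)
    correspond to the preferred, stricter thresholds listed in the text.
    """
    other_percent = 100.0 - polynucleotide_purity(target_weight, other_weight)
    return other_percent <= max_other_percent
```

For example, a preparation containing 92 weight units of the polynucleotide and 8 units of other polynucleotide material is 92% pure and satisfies the default criterion.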

Modification(s): In the context of the present invention the term "modification(s)" is 
intended to mean any chemical modification of a polypeptide consisting of an amino acid 
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, 

SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID 
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ 
ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, 
and SEQ ID NO:66, as well as genetic manipulation of the DNA encoding that polypeptide. 
The modification(s) can be replacement(s) of the amino acid side chain(s), substitution(s), 

deletion(s) and/or insertion(s) in or at the amino acid(s) of interest. 

Artificial variant: When used herein, the term "artificial variant" means a polypeptide 
having cellobiohydrolase I activity, which has been produced by an organism which is 
expressing a modified gene as compared to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ 
ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37, SEQ 
ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, 
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, or SEQ ID 
NO:65. The modified gene, from which said variant is produced when expressed in a suitable 
host, is obtained through human intervention by modification of a nucleotide sequence 
selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID 
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37, SEQ ID 
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ 
ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID 
NO:65. 

cDNA: The term "cDNA" when used in the present context, is intended to cover a DNA 
molecule which can be prepared by reverse transcription from a mature, spliced, mRNA 
molecule derived from a eukaryotic cell. cDNA lacks the intron sequences that are usually 

present in the corresponding genomic DNA. The initial, primary RNA transcript is a precursor 
to mRNA and it goes through a series of processing events before appearing as mature 
spliced mRNA. These events include the removal of intron sequences by a process called 
splicing. When cDNA is derived from mRNA it therefore lacks intron sequences. 
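The relationship between a primary transcript, the spliced mRNA and the resulting cDNA may be illustrated by the following Python sketch. It is a simplified model only: the intron coordinates are supplied by hand as hypothetical inputs, the function names are invented for the example, and the cDNA is reported in the sense orientation (the mRNA sequence with U replaced by T) rather than as the first-strand complement.

```python
def splice(primary_transcript, introns):
    """Remove intron intervals from a primary RNA transcript.

    introns: list of (start, end) pairs, 0-based and end-exclusive.
    Returns the mature, spliced mRNA sequence.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or start >= end or end > len(primary_transcript):
            raise ValueError("introns must be in range and non-overlapping")
        exons.append(primary_transcript[position:start])
        position = end
    exons.append(primary_transcript[position:])
    return "".join(exons)


def cdna_sense_strand(mrna):
    """Return the cDNA sequence in the sense orientation (U -> T)."""
    return mrna.upper().replace("U", "T")


# Toy transcript: two exons separated by one 10-nucleotide intron.
primary = "AUGGCC" + "GUAAGUUUAG" + "UUCUAA"
mature = splice(primary, [(6, 16)])
cdna = cdna_sense_strand(mature)
```

In this toy case the cDNA contains only the exon sequence, which is why a cDNA is shorter than the corresponding genomic DNA whenever introns are present.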

Nucleic acid construct: When used herein, the term "nucleic acid construct" means a 

nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally 
occurring gene or which has been modified to contain segments of nucleic acids in a manner 
that would not otherwise exist in nature. The term nucleic acid construct is synonymous with 
the term "expression cassette" when the nucleic acid construct contains the control sequences 
required for expression of a coding sequence of the present invention. 

Control sequence: The term "control sequences" is defined herein to include all 

components, which are necessary or advantageous for the expression of a polypeptide of the 
present invention. Each control sequence may be native or foreign to the nucleotide 
sequence encoding the polypeptide. Such control sequences include, but are not limited to, a 
leader, polyadenylation sequence, propeptide sequence, promoter, signal peptide sequence, 

and transcription terminator. At a minimum, the control sequences include a promoter, and 
transcriptional and translational stop signals. The control sequences may be provided with 
linkers for the purpose of introducing specific restriction sites facilitating ligation of the control 
sequences with the coding region of the nucleotide sequence encoding a polypeptide. 

Operably linked: The term "operably linked" is defined herein as a configuration in which 
a control sequence is appropriately placed at a position relative to the coding sequence of the 
DNA sequence such that the control sequence directs the expression of a polypeptide. 

Coding sequence: When used herein the term "coding sequence" is intended to cover a 

nucleotide sequence, which directly specifies the amino acid sequence of its protein product. 
The boundaries of the coding sequence are generally determined by an open reading frame, 
which usually begins with the ATG start codon. The coding sequence typically includes DNA, 
cDNA, and recombinant nucleotide sequences. 
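As a non-limiting illustration of how the boundaries of a coding sequence follow from an open reading frame, the Python sketch below locates the first ATG and reads in-frame codons up to the first stop codon. The function name is hypothetical, only the forward strand is scanned, and the standard stop codons TAA, TAG and TGA are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def first_open_reading_frame(dna):
    """Return the first ATG-initiated open reading frame, stop codon included.

    Scans the forward strand only. Returns None when no ATG is followed by
    an in-frame stop codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon from the candidate start position.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None
```

A real gene finder would also consider the reverse strand, alternative start codons and minimum length; those refinements are deliberately left out of this sketch.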

Expression: In the present context, the term "expression" includes any step involved in 

the production of the polypeptide including, but not limited to, transcription, post-transcriptional 
modification, translation, post-translational modification, and secretion. 

Expression vector: In the present context, the term "expression vector" covers a DNA 
molecule, linear or circular, that comprises a segment encoding a polypeptide of the invention, 
and which is operably linked to additional segments that provide for its transcription. 

Host cell: The term "host cell", as used herein, includes any cell type which is 
susceptible to transformation with a nucleic acid construct. 

The terms "polynucleotide probe", "hybridization" as well as the various stringency 
conditions are defined in the section entitled "Polypeptides Having Cellobiohydrolase I 
Activity". 

Thermostability: The term "thermostability", as used herein, is measured as described in 
Example 2. 

Detailed Description of the Invention 


Polypeptides Having Cellobiohydrolase I Activity 

In a first embodiment, the present invention relates to polypeptides having 
cellobiohydrolase I activity, where the polypeptides comprise, preferably consist of, an 
amino acid sequence which has a degree of identity to an amino acid sequence selected from 

the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID 
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ 
ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, 
SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66, (i.e., the 
mature polypeptide) of at least 65%, preferably at least 70%, e.g. at least 75%, more 

preferably at least 80%, such as at least 85%, even more preferably at least 90%, most 
preferably at least 95%, e.g. at least 96%, such as at least 97%, and even most preferably at 
least 98%, such as at least 99% (hereinafter "homologous polypeptides"). In an interesting 
embodiment, the amino acid sequence differs by at the most ten amino acids (e.g. by ten 
amino acids), in particular by at the most five amino acids (e.g. by five amino acids), such as 

by at the most four amino acids (e.g. by four amino acids), e.g. by at the most three amino 
acids (e.g. by three amino acids) from an amino acid sequence selected from the group 
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ 

ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, 
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID 
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66. In a particularly 
interesting embodiment, the amino acid sequence differs by at the most two amino acids (e.g. 
by two amino acids), such as by one amino acid from an amino acid sequence selected from 
the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID 
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ 
ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, 
SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66. 
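For illustration only, the two criteria used in this embodiment (a degree of identity and a maximum number of differing amino acids) may be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned to equal length, with "-" marking gaps, and simply compares them position by position; it is not the alignment procedure by which the degree of identity is determined for the purposes of the invention, and the function names are hypothetical.

```python
def count_differences(aligned_a, aligned_b):
    """Count positions at which two pre-aligned sequences differ.

    A gap character '-' opposite a residue counts as one difference
    (i.e. an insertion or deletion of one amino acid).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


def percent_identity(aligned_a, aligned_b):
    """Return the percentage of identical positions over the alignment length."""
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = len(aligned_a) - count_differences(aligned_a, aligned_b)
    return 100.0 * matches / len(aligned_a)


def differs_by_at_most(aligned_a, aligned_b, max_differences):
    """Check an 'at the most N amino acids' criterion on an alignment."""
    return count_differences(aligned_a, aligned_b) <= max_differences
```

With the toy pair `MKTAYIAKQR` and `MKTAYLAKQR` the sketch reports one difference and 90% identity, so the pair would satisfy an "at the most two amino acids" criterion.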

Preferably, the polypeptides of the present invention comprise an amino acid sequence 

selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID 
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ 
ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, 
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID 

NO:66; an allelic variant thereof; or a fragment thereof that has cellobiohydrolase I activity. In 
another preferred embodiment, the polypeptide of the present invention consists of an amino 
acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID 
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID 
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ 

ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, 
and SEQ ID NO:66. 

The polypeptide of the invention may be a wild-type cellobiohydrolase I identified and 
isolated from a natural source. Such wild-type polypeptides may be specifically screened for 
by standard techniques known in the art, such as molecular screening as described in 

Example 1. Furthermore, the polypeptide of the invention may be prepared by the DNA 
shuffling technique, such as described in J.E. Ness et al. Nature Biotechnology 17, 893-896 
(1999). Moreover, the polypeptide of the invention may be an artificial variant which comprises, 
preferably consists of, an amino acid sequence that has at least one substitution, deletion 
and/or insertion of an amino acid as compared to an amino acid sequence selected from the 

group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, 
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID 
NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ 
ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66. Such artificial 
variants may be constructed by standard techniques known in the art, such as by 
site-directed/random mutagenesis of the polypeptide comprising an amino acid sequence selected 
from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ 
ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, 
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID 
NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66. 
In one embodiment of the invention, amino acid changes (in the artificial variant as well as in 
wild-type polypeptides) are of a minor nature, that is conservative amino acid substitutions that 
do not significantly affect the folding and/or activity of the protein; small deletions, typically of 
one to about 30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino- 
terminal methionine residue; a small linker peptide of up to about 20-25 residues; or a small 
extension that facilitates purification by changing net charge or another function, such as a 
poly-histidine tract, an antigenic epitope or a binding domain. 

Examples of conservative substitutions are within the group of basic amino acids 

(arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar 
amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine, valine 
and methionine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small 
amino acids (glycine, alanine, serine and threonine). Amino acid substitutions which do not 

generally alter the specific activity are known in the art and are described, for example, by H. 
Neurath and R.L. Hill, 1979, In, The Proteins, Academic Press, New York. The most 
commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, 
Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, and 
Asp/Gly as well as these in reverse. 
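The grouping of amino acids given above lends itself to a simple lookup. The Python sketch below is offered as an illustration only; the dictionary and function names are hypothetical, and the test applied is merely whether both residues fall within one of the six groups named in the text. It does not encode the separate list of commonly occurring exchanges, some of which (e.g. Ala/Val) cross group boundaries.

```python
# Groups as listed in the text, using three-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Glu", "Asp"},
    "polar": {"Gln", "Asn"},
    "hydrophobic": {"Leu", "Ile", "Val", "Met"},
    "aromatic": {"Phe", "Trp", "Tyr"},
    "small": {"Gly", "Ala", "Ser", "Thr"},
}


def is_conservative_substitution(original, replacement):
    """Return True when both residues belong to the same listed group."""
    if original == replacement:
        return False  # no substitution has taken place
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )
```

For instance, Lys to Arg is classified as conservative (both basic), whereas Lys to Glu is not.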

In an interesting embodiment of the invention, the amino acid changes are of such a 

nature that the physico-chemical properties of the polypeptides are altered. For example, 
amino acid changes may be performed, which improve the thermal stability of the polypeptide, 
which alter the substrate specificity, which change the pH optimum, and the like. 

Preferably, the number of such substitutions, deletions and/or insertions as compared to 

an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, 
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, 
SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID 
NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ 
ID NO:60, and SEQ ID NO:66 is at the most 10, such as at the most 9, e.g. at the most 8, 

more preferably at the most 7, e.g. at the most 6, such as at the most 5, most preferably at the 
most 4, e.g. at the most 3, such as at the most 2, in particular at the most 1. 

The present inventors have isolated nucleotide sequences encoding polypeptides having 
cellobiohydrolase I activity from the microorganisms selected from the group consisting of 
Acremonium thermophilum, Chaetomium thermophilum, Scytalidium sp., Scytalidium 

thermophilum, Thermoascus aurantiacus, Thielavia australiensis, Verticillium tenerum, 
Melanocarpus albomyces, Poitrasia circinans, Coprinus cinereus, Trichothecium roseum, 
Humicola nigrescens, Cladorrhinum foecundissimum, Diplodia gossypina, Myceliophthora 

thermophila, Rhizomucor pusillus, Meripilus giganteus, Exidia glandulosa, Xylaria hypoxylon, 
Trichophaea saccata, Acremonium sp., Chaetomium sp., Chaetomidium pingtungium, 
Myceliophthora thermophila, Myceliophthora hinnulea, Sporotrichum pruinosum, Thielavia cf. 
microspora, Aspergillus sp., Scopulariopsis sp., Fusarium sp., Verticillium sp., 
Pseudoplectania nigrella, and Phytophthora infestans; and from the gut of the termite larvae 
Neotermes castaneus. Thus, in a second embodiment, the present invention relates to 
polypeptides comprising an amino acid sequence which has at least 65% identity with the 
polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence 
present in an organism selected from the group consisting of Acremonium thermophilum, 

Chaetomium thermophilum, Scytalidium sp., Scytalidium thermophilum, Thermoascus 
aurantiacus, Thielavia australiensis, Verticillium tenerum, Neotermes castaneus, 
Melanocarpus albomyces, Poitrasia circinans, Coprinus cinereus, Trichothecium roseum IFO 
5372, Humicola nigrescens CBS 819.73, Cladorrhinum foecundissimum CBS 427.97, Diplodia 
gossypina CBS 247.96, Myceliophthora thermophila CBS 117.65, Rhizomucor pusillus CBS 

109471, Meripilus giganteus CBS 521.95, Exidia glandulosa CBS 2377.96, Xylaria hypoxylon 
CBS 284.96, Trichophaea saccata CBS 804.70, Acremonium sp., Chaetomium sp., 
Chaetomidium pingtungium, Myceliophthora thermophila, Myceliophthora hinnulea, 
Sporotrichum pruinosum, Thielavia cf. microspora, Aspergillus sp., Scopulariopsis sp., 
Fusarium sp., Verticillium sp., Pseudoplectania nigrella, and Phytophthora infestans. In an 

interesting embodiment of the invention, the polypeptide comprises an amino acid sequence 
which has at least 70%, e.g. at least 75%, preferably at least 80%, such as at least 85%, more 
preferably at least 90%, most preferably at least 95%, e.g. at least 96%, such as at least 97%, 
and even most preferably at least 98%, such as at least 99% identity with the polypeptide 
encoded by the cellobiohydrolase I encoding part of the nucleotide sequence present in an 

organism selected from the group consisting of Acremonium thermophilum, Chaetomium 
thermophilum, Scytalidium sp., Scytalidium thermophilum, Thermoascus aurantiacus, 
Thielavia australiensis, Verticillium tenerum, Neotermes castaneus, Melanocarpus albomyces, 
Poitrasia circinans, Coprinus cinereus, Trichothecium roseum IFO 5372, Humicola nigrescens 
CBS 819.73, Cladorrhinum foecundissimum CBS 427.97, Diplodia gossypina CBS 247.96, 

Myceliophthora thermophila CBS 117.65, Rhizomucor pusillus CBS 109471, Meripilus 
giganteus CBS 521.95, Exidia glandulosa CBS 2377.96, Xylaria hypoxylon CBS 284.96, 
Trichophaea saccata CBS 804.70, Acremonium sp., Chaetomium sp., Chaetomidium 
pingtungium, Myceliophthora thermophila, Myceliophthora hinnulea, Sporotrichum pruinosum, 
Thielavia cf. microspora, Aspergillus sp., Scopulariopsis sp., Fusarium sp., Verticillium sp., 

Pseudoplectania nigrella, and Phytophthora infestans (hereinafter "homologous 
polypeptides"). In an interesting embodiment, the amino acid sequence differs by at the most 
ten amino acids (e.g. by ten amino acids), in particular by at the most five amino acids (e.g. by 
five amino acids), such as by at the most four amino acids (e.g. by four amino acids), e.g. by 
at the most three amino acids (e.g. by three amino acids) from the polypeptide encoded by the 
cellobiohydrolase I encoding part of the nucleotide sequence present in an organism selected 
from the group consisting of Acremonium thermophilum, Chaetomium thermophilum, 
Scytalidium sp., Scytalidium thermophilum, Thermoascus aurantiacus, Thielavia australiensis, 
Verticillium tenerum, Neotermes castaneus, Melanocarpus albomyces, Poitrasia circinans, 
Coprinus cinereus, Trichothecium roseum IFO 5372, Humicola nigrescens CBS 819.73, 
Cladorrhinum foecundissimum CBS 427.97, Diplodia gossypina CBS 247.96, Myceliophthora 
thermophila CBS 117.65, Rhizomucor pusillus CBS 109471, Meripilus giganteus CBS 521.95, 

Exidia glandulosa CBS 2377.96, Xylaria hypoxylon CBS 284.96, Trichophaea saccata CBS 
804.70, Acremonium sp., Chaetomium sp., Chaetomidium pingtungium, Myceliophthora 
thermophila, Myceliophthora hinnulea, Sporotrichum pruinosum, Thielavia cf. microspora, 
Aspergillus sp., Scopulariopsis sp., Fusarium sp., Verticillium sp., Pseudoplectania nigrella, 
and Phytophthora infestans. In a particularly interesting embodiment, the amino acid sequence 
differs by at the most two amino acids (e.g. by two amino acids), such as by one amino acid 
from the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide 
sequence present in an organism selected from the group consisting of Acremonium 
thermophilum, Chaetomium thermophilum, Scytalidium sp., Scytalidium thermophilum, 
Thermoascus aurantiacus, Thielavia australiensis, Verticillium tenerum, Neotermes castaneus, 

Melanocarpus albomyces, Poitrasia circinans, Coprinus cinereus, Trichothecium roseum IFO 
5372, Humicola nigrescens CBS 819.73, Cladorrhinum foecundissimum CBS 427.97, Diplodia 
gossypina CBS 247.96, Myceliophthora thermophila CBS 117.65, Rhizomucor pusillus CBS 
109471, Meripilus giganteus CBS 521.95, Exidia glandulosa CBS 2377.96, Xylaria hypoxylon 
CBS 284.96, Trichophaea saccata CBS 804.70, Acremonium sp., Chaetomium sp., 

Chaetomidium pingtungium, Myceliophthora thermophila, Myceliophthora hinnulea, 
Sporotrichum pruinosum, Thielavia cf. microspora, Aspergillus sp., Scopulariopsis sp., 
Fusarium sp., Verticillium sp., Pseudoplectania nigrella, and Phytophthora infestans. 

Preferably, the polypeptides of the present invention comprise the amino acid sequence 
of the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide 

sequence inserted into a plasmid present in a deposited microorganism selected from the 
group consisting of CGMCC No. 0584, CGMCC No. 0581, CGMCC No. 0585, CGMCC No. 
0582, CGMCC No. 0583, CBS 109513, DSM 14348, CGMCC No. 0580, DSM 15064, DSM 
15065, DSM 15066, DSM 15067, CGMCC No. 0747, CGMCC No. 0748, CGMCC No. 0749, 
and CGMCC No. 0750. In another preferred embodiment, the polypeptide of the present 

invention consists of the amino acid sequence of the polypeptide encoded by the 
cellobiohydrolase I encoding part of the nucleotide sequence inserted into a plasmid present in 
a deposited microorganism selected from the group consisting of CGMCC No. 0584, CGMCC 
No. 0581, CGMCC No. 0585, CGMCC No. 0582, CGMCC No. 0583, CBS 109513, DSM 
14348, CGMCC No. 0580, DSM 15064, DSM 15065, DSM 15066, DSM 15067, CGMCC 
No. 0747, CGMCC No. 0748, CGMCC No. 0749, and CGMCC No. 0750. 

In a similar way as described above, the polypeptide of the invention may be an artificial 
variant which comprises, preferably consists of, an amino acid sequence that has at least one 
substitution, deletion and/or insertion of an amino acid as compared to the amino acid 
sequence encoded by the cellobiohydrolase I encoding part of the nucleotide sequence 
inserted into a plasmid present in a deposited microorganism selected from the group 
consisting of CGMCC No. 0584, CGMCC No. 0581, CGMCC No. 0585, CGMCC No. 0582, 
CGMCC No. 0583, CBS 109513, DSM 14348, CGMCC No. 0580, DSM 15064, DSM 
15065, DSM 15066, DSM 15067, CGMCC No. 0747, CGMCC No. 0748, CGMCC No. 0749, 
and CGMCC No. 0750. 

In a third embodiment, the present invention relates to polypeptides having 
cellobiohydrolase I activity which are encoded by nucleotide sequences which hybridize under 
very low stringency conditions, preferably under low stringency conditions, more preferably 
under medium stringency conditions, more preferably under medium-high stringency 
conditions, even more preferably under high stringency conditions, and most preferably under 
very high stringency conditions with a polynucleotide probe selected from the group consisting 
of (i) the complementary strand of the nucleotides selected from the group consisting of: 
nucleotides 1 to 1578 of SEQ ID NO:1, 
nucleotides 1 to 1587 of SEQ ID NO:3, 
nucleotides 1 to 1353 of SEQ ID NO:5, 
nucleotides 1 to 1371 of SEQ ID NO:7, 
nucleotides 1 to 1614 of SEQ ID NO:9, 
nucleotides 1 to 1245 of SEQ ID NO:11, 
nucleotides 1 to 1341 of SEQ ID NO:13, 
nucleotides 1 to 1356 of SEQ ID NO:15, 
nucleotides 1 to 1365 of SEQ ID NO:37, 
nucleotides 1 to 1377 of SEQ ID NO:39, 
nucleotides 1 to 1353 of SEQ ID NO:41, 
nucleotides 1 to 1341 of SEQ ID NO:43, 
nucleotides 1 to 1584 of SEQ ID NO:45, 
nucleotides 1 to 1368 of SEQ ID NO:47, 
nucleotides 1 to 1395 of SEQ ID NO:49, 
nucleotides 1 to 1383 of SEQ ID NO:51, 
nucleotides 1 to 1353 of SEQ ID NO:53, 
nucleotides 1 to 1599 of SEQ ID NO:55, 
nucleotides 1 to 1383 of SEQ ID NO:57, 
nucleotides 1 to 1578 of SEQ ID NO:59, and 
nucleotides 1 to 1371 of SEQ ID NO:65; 

[...] the complementary strand of the nucleotides selected from the group consisting of [...] 

In another embodiment, the present invention relates to polypeptides having 
cellobiohydrolase I activity which are encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in a microorganism selected from the group consisting of: 
a microorganism belonging to Zygomycota, preferably belonging to the Mucorales, more 
preferably belonging to the family Mucoraceae, most preferably belonging to the genus 
Rhizomucor (e.g. Rhizomucor pusillus), or the family Choanephoraceae, most preferably 
belonging to the genus Poitrasia (e.g. Poitrasia circinans), 

a microorganism belonging to the Oomycetes, preferably to the order Pythiales, more 
preferably to the family Pythiaceae, most preferably to the genus Phytophthora (e.g. 
Phytophthora infestans), 

a microorganism belonging to Auriculariales (an order of the Basidiomycota, 
Hymenomycetes), preferably belonging to the family Exidiaceae, more preferably belonging to 
the genus Exidia (e.g. Exidia glandulosa), 

a microorganism belonging to Xylariales (an order of the Ascomycota, Sordariomycetes), 
preferably belonging to the family Xylariaceae, more preferably belonging to the genus Xylaria 
(e.g. Xylaria hypoxylon), 

a microorganism belonging to Dothideales (an order of the Ascomycota, Dothideomycetes), 
preferably belonging to the family Dothideaceae, more preferably belonging to the genus 
Diplodia (e.g. Diplodia gossypina), 

a microorganism belonging to Pezizales (an order of the Ascomycota), preferably belonging to 
the family Pyronemataceae, more preferably belonging to the genus Trichophaea (e.g. 
Trichophaea saccata), or the family Sarcosomataceae, more preferably belonging to the 
genus Pseudoplectania (e.g. Pseudoplectania nigrella), 

a microorganism belonging to the family Rigidiporaceae (under Basidiomycota, 
Hymenomycetes, Hymenomycetales), more preferably belonging to the genus Meripilus (e.g. 
Meripilus giganteus), 

a microorganism belonging to the family Meruliaceae (under Basidiomycota, Hymenomycetes, 
Stereales), more preferably belonging to the genus Sporotrichum (e.g. Sporotrichum sp.), 

a microorganism belonging to the family Agaricaceae (under Basidiomycota, Hymenomycetes, 
Agaricales), more preferably belonging to the genus Coprinus (e.g. Coprinus cinereus), 

a microorganism belonging to the family Hypocreaceae (under Ascomycota, Sordariomycetes, 
Hypocreales), more preferably belonging to the genus Acremonium (e.g. Acremonium 
thermophilum; Acremonium sp.) or the (mitosporic) genus Verticillium (e.g. Verticillium 
tenerum), 

a microorganism belonging to the genus Cladorrhinum (under Ascomycota, Sordariomycetes, 
Sordariales, Sordariaceae) e.g. Cladorrhinum foecundissimum, 

a microorganism belonging to the genus Myceliophthora (under Ascomycota, 
Sordariomycetes, Sordariales, Sordariaceae) e.g. Myceliophthora thermophila or 
Myceliophthora hinnulea, 

a microorganism belonging to the genus Chaetomium (under Ascomycota, Sordariomycetes, 
Sordariales, Chaetomiaceae) e.g. Chaetomium thermophilum, 

a microorganism belonging to the genus Chaetomidium (under Ascomycota, Sordariomycetes, 
Sordariales, Chaetomiaceae) e.g. Chaetomidium pingtungium, 

a microorganism belonging to the genus Thielavia (under Ascomycota, Sordariomycetes, 
Sordariales, Chaetomiaceae) e.g. Thielavia australiensis or Thielavia microspora, 

a microorganism belonging to the genus Thermoascus (under Ascomycota, Eurotiomycetes, 
Eurotiales, Trichocomaceae) e.g. Thermoascus aurantiacus, 

a microorganism belonging to the genus Trichothecium (mitosporic Ascomycota) e.g. 
Trichothecium roseum, and 

a microorganism belonging to the species Humicola nigrescens. 

A nucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID 

NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID 
NO:15, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ 
ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, 
SEQ ID NO:59, SEQ ID NO:65, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID 
NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ 

ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, 
SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID 
NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, and SEQ ID NO:67, or a 
subsequence thereof, as well as an amino acid sequence selected from the group consisting 
of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, 
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID 
NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ 
ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66, or a fragment thereof, may be 
used to design a polynucleotide probe to identify and clone DNA encoding polypeptides having 
cellobiohydrolase I activity from strains of different genera or species according to methods 

well known in the art. In particular, such probes can be used for hybridization with the 
genomic DNA or cDNA of the genus or species of interest, following standard Southern blotting 
procedures, in order to identify and isolate the corresponding gene therein. Such probes can 
be considerably shorter than the entire sequence, but should be at least 15, preferably at least 
25, more preferably at least 35 nucleotides in length, such as at least 70 nucleotides in length. 

It is, however, preferred that the polynucleotide probe is at least 100 nucleotides in length. For 
example, the polynucleotide probe may be at least 200 nucleotides in length, at least 300 
nucleotides in length, at least 400 nucleotides in length or at least 500 nucleotides in length. 
Even longer probes may be used, e.g., polynucleotide probes which are at least 600 
nucleotides in length, at least 700 nucleotides in length, at least 800 nucleotides in length, or 
at least 900 nucleotides in length. Both DNA and RNA probes can be used. The probes are 
typically labeled for detecting the corresponding gene (for example, with ³²P, ³H, ³⁵S, biotin, or 
avidin). 
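Since the probes discussed here are frequently defined as the complementary strand of a stated nucleotide range, and are subject to a minimum length, the following Python sketch illustrates both operations. It is an illustration under simplifying assumptions: only unambiguous DNA bases (A, C, G, T) are accepted, the complementary strand is written 5' to 3' (i.e. as the reverse complement), the function names are hypothetical, and the default minimum of 15 nucleotides reflects the lower limit stated above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complementary_strand(dna):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported bases: {sorted(unexpected)}")
    # Complement each base, then reverse to keep 5'->3' orientation.
    return dna.translate(_COMPLEMENT)[::-1]


def probe_from_range(sequence, first, last):
    """Build a probe from nucleotides first..last (1-based, inclusive)."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("range lies outside the sequence")
    return complementary_strand(sequence[first - 1:last])


def meets_minimum_probe_length(probe, minimum=15):
    """Check the probe against a minimum length in nucleotides."""
    return len(probe) >= minimum
```

A call such as `probe_from_range(seq, 1, 200)` would correspond to "the complementary strand of nucleotides 1 to 200" of a sequence held in the hypothetical variable `seq`.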

Thus, a genomic DNA or cDNA library prepared from such other organisms may be 
screened for DNA which hybridizes with the probes described above and which encodes a 
polypeptide having cellobiohydrolase I activity. Genomic or other DNA from such other 
organisms may be separated by agarose or polyacrylamide gel electrophoresis, or other 
separation techniques. DNA from the libraries or the separated DNA may be transferred to, 
and immobilized on, nitrocellulose or other suitable carrier materials. In order to identify a 
clone or DNA which is homologous with SEQ ID NO:1 the carrier material with the immobilized 
DNA is used in a Southern blot. 

For purposes of the present invention, hybridization indicates that the nucleotide 
sequence hybridizes to a labeled polynucleotide probe which hybridizes to the nucleotide 
sequence shown in SEQ ID NO:1 under very low to very high stringency conditions. 
Molecules to which the polynucleotide probe hybridizes under these conditions may be 
detected using X-ray film or by any other method known in the art. Whenever the term 
"polynucleotide probe" is used in the present context, it is to be understood that such a probe 
contains at least 15 nucleotides. 

In an interesting embodiment, the polynucleotide probe is the complementary strand of 
[...] 
nucleotides 1 to 500 of SEQ ID NO:[...], 
[...] 
nucleotides 1 to [...] of SEQ ID NO:20, 
nucleotides 1 to 232 of SEQ ID NO:[...], 
[...] 
nucleotides 1 to [...] of SEQ ID NO:24, 
nucleotides 1 to [...] of SEQ ID NO:25, 
nucleotides 1 to 492 of SEQ ID NO:26, 
nucleotides 1 to 481 of SEQ ID NO:27, 
nucleotides 1 to 463 of SEQ ID NO:28, 
nucleotides 1 to [...] of SEQ ID NO:29, 
nucleotides 1 to [...] of SEQ ID NO:30, 
nucleotides 1 to [...] of SEQ ID NO:31, 
nucleotides 1 to 477 of SEQ ID NO:32, 
nucleotides 1 to 500 of SEQ ID NO:33, 
nucleotides 1 to [...] of SEQ ID NO:34, 
[...] 
nucleotides 1 to [...] of SEQ ID NO:36, 
nucleotides 1 to [...] of SEQ ID NO:61, 
nucleotides 1 to 497 of SEQ ID NO:62, 
nucleotides 1 to 498 of SEQ ID NO:63, 
nucleotides 1 to 525 of SEQ ID NO:64, 
nucleotides 1 to 951 of SEQ ID NO:67; 

or the complementary strand of the nucleotides selected from the group consisting of:
[...]
nucleotides 1 to 200 of SEQ ID NO:64, and
nucleotides 1 to 200 of SEQ ID NO:67. 
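
By way of illustration only, deriving a probe as the complementary strand of a stated nucleotide range can be sketched as follows. The function names and the 21-nucleotide example sequence are hypothetical placeholders and are not taken from the sequence listing; only the 15-nucleotide minimum comes from the text above.

```python
# Illustrative sketch: names and the example sequence are assumptions,
# not part of the sequence listing.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

MIN_PROBE_LENGTH = 15  # minimum probe length stated in the text


def complementary_strand(seq: str) -> str:
    """Return the complementary strand written 5'->3' (reverse complement)."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_from_region(seq: str, first: int, last: int) -> str:
    """Complementary strand of nucleotides first..last (1-based, inclusive)."""
    if not (1 <= first <= last <= len(seq)):
        raise ValueError("region lies outside the sequence")
    region = seq[first - 1:last]
    if len(region) < MIN_PROBE_LENGTH:
        raise ValueError("a probe must contain at least 15 nucleotides")
    return complementary_strand(region)


example = "ATGGCCTTCAAGCTGACCGGT"  # placeholder sequence, 21 nt
probe = probe_from_region(example, 1, 15)
```

A range such as "nucleotides 1 to 200" would be passed as `first=1, last=200` against the relevant listed sequence.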

In another interesting embodiment, the polynucleotide probe is the complementary
strand of the nucleotide sequence which encodes a polypeptide selected from the group
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ
ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66. In a further
interesting embodiment, the polynucleotide probe is the complementary strand of a nucleotide
sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37,
SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID
NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and
SEQ ID NO:65. In another interesting embodiment, the polynucleotide probe is the
complementary strand of the nucleotide sequence contained in a plasmid which is contained in
a deposited microorganism selected from the group consisting of CGMCC No. 0584, CGMCC
No. 0581, CGMCC No. 0585, CGMCC No. 0582, CGMCC No. 0583, CGMCC No. 0580, CBS
109513, DSM 14348, DSM 15064, DSM 15065, DSM 15066, DSM 15067, CGMCC No. 0747,
CGMCC No. 0748, CGMCC No. 0749, and CGMCC No. 0750.

For long probes of at least 100 nucleotides in length, very low to very high stringency
conditions are defined as prehybridization and hybridization at 42°C in 5X SSPE, 1.0% SDS,
5X Denhardt's solution, 100 µg/ml sheared and denatured salmon sperm DNA, following
standard Southern blotting procedures. Preferably, the long probes of at least 100 nucleotides
do not contain more than 1000 nucleotides. For long probes of at least 100 nucleotides in
length, the carrier material is finally washed three times each for 15 minutes using 2 x SSC,
0.1% SDS at 42°C (very low stringency), preferably washed three times each for 15 minutes
using 0.5 x SSC, 0.1% SDS at 42°C (low stringency), more preferably washed three times
each for 15 minutes using 0.2 x SSC, 0.1% SDS at 42°C (medium stringency), even more
preferably washed three times each for 15 minutes using 0.2 x SSC, 0.1% SDS at 55°C
(medium-high stringency), most preferably washed three times each for 15 minutes using 0.1
x SSC, 0.1% SDS at 60°C (high stringency), in particular washed three times each for 15
minutes using 0.1 x SSC, 0.1% SDS at 68°C (very high stringency).
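
For orientation, the six final-wash conditions for long probes can be collected in a small lookup table. The sketch below is illustrative only: the names `Wash`, `LONG_PROBE_WASHES` and `at_least_as_stringent` are hypothetical, and the comparison rule (lower salt and higher temperature means a more stringent wash) is a general rule of thumb assumed here rather than a definition taken from the text.

```python
from typing import NamedTuple


class Wash(NamedTuple):
    ssc_x: float        # SSC concentration as a multiple of 1X
    sds_percent: float  # SDS concentration in percent
    temp_c: int         # wash temperature in degrees Celsius


# Final wash conditions for long probes as listed in the text
# (three washes of 15 minutes each at every level).
LONG_PROBE_WASHES = {
    "very low": Wash(2.0, 0.1, 42),
    "low": Wash(0.5, 0.1, 42),
    "medium": Wash(0.2, 0.1, 42),
    "medium-high": Wash(0.2, 0.1, 55),
    "high": Wash(0.1, 0.1, 60),
    "very high": Wash(0.1, 0.1, 68),
}


def at_least_as_stringent(level_a: str, level_b: str) -> bool:
    """Assumed rule: lower salt and higher temperature is more stringent."""
    a, b = LONG_PROBE_WASHES[level_a], LONG_PROBE_WASHES[level_b]
    return a.ssc_x <= b.ssc_x and a.temp_c >= b.temp_c
```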

Although not particularly preferred, it is contemplated that shorter probes, e.g. probes
which are from about 15 to 99 nucleotides in length, such as from about 15 to about 70
nucleotides in length, may also be used. For such short probes, stringency conditions are
defined as prehybridization, hybridization, and washing post-hybridization at 5°C to 10°C
below the calculated Tm using the calculation according to Bolton and McCarthy (1962,
Proceedings of the National Academy of Sciences USA 48:1390) in 0.9 M NaCl, 0.09 M
Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1X Denhardt's solution, 1 mM sodium pyrophosphate,
1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of yeast RNA per ml following
standard Southern blotting procedures.

For short probes which are about 15 nucleotides to 99 nucleotides in length, the carrier
material is washed once in 6X SSC plus 0.1% SDS for 15 minutes and twice each for 15
minutes using 6X SSC at 5°C to 10°C below the calculated Tm.
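
The text cites Bolton and McCarthy for the Tm calculation but does not reproduce the formula. As a rough sketch only, the form commonly quoted in laboratory manuals for that relation (Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 500/N, without formamide) can be used to estimate the 5°C to 10°C window. Both the formula as written here and the treatment of 0.9 M NaCl as the sole monovalent cation concentration are assumptions; the function names are hypothetical.

```python
import math


def melting_temperature(seq: str, sodium_molar: float = 0.9) -> float:
    """Estimate Tm in degrees Celsius with the commonly quoted relation
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 500/N (assumed form)."""
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    gc_percent = 100.0 * sum(base in "GCgc" for base in seq) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 500.0 / n


def hybridization_window(seq: str, sodium_molar: float = 0.9) -> tuple:
    """Return (low, high) temperatures lying 10 and 5 degrees below Tm."""
    tm = melting_temperature(seq, sodium_molar)
    return (tm - 10.0, tm - 5.0)
```

For a given short probe, `hybridization_window(probe)` gives the temperature range within which prehybridization, hybridization and washing would fall under these assumptions.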



Sources for Polypeptides Having Cellobiohydrolase I Activity 

A polypeptide of the present invention may be obtained from microorganisms of any
genus. For purposes of the present invention, the term "obtained from" as used herein shall 
mean that the polypeptide encoded by the nucleotide sequence is produced by a cell in which 
the nucleotide sequence is naturally present or into which the nucleotide sequence has been 
inserted. In a preferred embodiment, the polypeptide is secreted extracellularly. 

A polypeptide of the present invention may be a bacterial polypeptide. For example, the
polypeptide may be a gram positive bacterial polypeptide such as a Bacillus polypeptide, e.g.,
a Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus
coagulans, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus megaterium, Bacillus
stearothermophilus, Bacillus subtilis, or Bacillus thuringiensis polypeptide; or a Streptomyces
polypeptide, e.g., a Streptomyces lividans or Streptomyces murinus polypeptide; or a gram
negative bacterial polypeptide, e.g., an E. coli or a Pseudomonas sp. polypeptide.

A polypeptide of the present invention may be a fungal polypeptide, and more preferably 
a yeast polypeptide such as a Candida, Kluyveromyces, Neocallimastix, Pichia, Piromyces, 
Saccharomyces, Schizosaccharomyces, or Yarrowia polypeptide; or more preferably a
filamentous fungal polypeptide such as an Acremonium, Aspergillus, Aureobasidium,
Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, 
Neurospora, Paecilomyces, Penicillium, Schizophyllum, Talaromyces, Thermoascus, 
Thielavia, Tolypocladium, or Trichoderma polypeptide. 

In an interesting embodiment, the polypeptide is a Saccharomyces carlsbergensis,
Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii,
Saccharomyces kluyveri, Saccharomyces norbensis or Saccharomyces oviformis polypeptide. 

In another interesting embodiment, the polypeptide is an Aspergillus aculeatus,
Aspergillus awamori, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans,
Aspergillus niger, Aspergillus oryzae, Fusarium bactridioides, Fusarium cerealis, Fusarium
crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium
heterosporum, Fusarium negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium
roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium
sulphureum, Fusarium torulosum, Fusarium trichothecioides, Fusarium venenatum, Humicola
insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora
crassa, Penicillium purpurogenum, Trichoderma harzianum, Trichoderma koningii,
Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma viride polypeptide.

In a preferred embodiment, the polypeptide is an Acremonium thermophilum,
Chaetomium thermophilum, Scytalidium sp., Scytalidium thermophilum, Thermoascus
aurantiacus, Thielavia australiensis, Verticillium tenerum, Neotermes castaneus,
Melanocarpus albomyces, Poitrasia circinans, Coprinus cinereus, Trichothecium roseum,
Humicola nigrescens, Cladorrhinum foecundissimum, Diplodia gossypina, Myceliophthora
thermophila, Rhizomucor pusillus, Meripilus giganteus, Exidia glandulosa, Xylaria hypoxylon,
Trichophaea saccata, Acremonium sp., Chaetomium sp., Chaetomidium pingtungium,
Myceliophthora thermophila, Myceliophthora hinnulea, Sporotrichum pruinosum, Thielavia cf.
microspora, Aspergillus sp., Scopulariopsis sp., Fusarium sp., Verticillium sp.,
Pseudoplectania nigrella, or Phytophthora infestans polypeptide.

In a more preferred embodiment, the polypeptide is an Acremonium thermophilum,
Chaetomium thermophilum, Scytalidium sp., Scytalidium thermophilum, Thermoascus
aurantiacus, Thielavia australiensis, Verticillium tenerum, Neotermes castaneus,
Melanocarpus albomyces, Poitrasia circinans, or Coprinus cinereus polypeptide, e.g., the
polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID
NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ
ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56,
SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66.

It will be understood that for the aforementioned species, the invention encompasses
both the perfect and imperfect states, and other taxonomic equivalents, e.g., anamorphs,
regardless of the species name by which they are known. Those skilled in the art will readily 
recognize the identity of appropriate equivalents. 

Strains of these species are readily accessible to the public in a number of culture 
collections, such as the American Type Culture Collection (ATCC), Deutsche Sammlung von
Mikroorganismen und Zellkulturen GmbH (DSMZ), China General Microbiological Culture
Collection Center (CGMCC), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural 
Research Service Patent Culture Collection, Northern Regional Research Center (NRRL). 

Furthermore, such polypeptides may be identified and obtained from other sources
including microorganisms isolated from nature (e.g., soil, water, plants, animals, etc.) using the
above-mentioned probes. Techniques for isolating microorganisms from natural habitats are
well known in the art. The nucleotide sequence may then be derived by similarly screening a
genomic or cDNA library of another microorganism. Once a nucleotide sequence encoding a
polypeptide has been detected with the probe(s), the sequence may be isolated or cloned by
utilizing techniques which are known to those of ordinary skill in the art (see, e.g., Sambrook et
al., 1989, supra).

Polypeptides encoded by nucleotide sequences of the present invention also include
fused polypeptides or cleavable fusion polypeptides in which another polypeptide is fused at
the N-terminus or the C-terminus of the polypeptide or fragment thereof. A fused polypeptide
is produced by fusing a nucleotide sequence (or a portion thereof) encoding another
polypeptide to a nucleotide sequence (or a portion thereof) of the present invention.
Techniques for producing fusion polypeptides are known in the art, and include ligating the
coding sequences encoding the polypeptides so that they are in frame and that expression of
the fused polypeptide is under control of the same promoter(s) and terminator.
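
The in-frame requirement for a fusion can be illustrated with a minimal sketch. It assumes both inputs are complete coding sequences given as plain strings and simply removes a terminal stop codon from the upstream partner; the function name is hypothetical and real constructs would also need linker, cleavage-site and restriction-site considerations not modelled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Join two coding sequences so the downstream one stays in frame."""
    up, down = upstream_cds.upper(), downstream_cds.upper()
    for name, cds in (("upstream", up), ("downstream", down)):
        if len(cds) % 3 != 0:
            raise ValueError(f"{name} coding sequence is not a whole number of codons")
    # Drop the upstream stop codon so translation reads through the junction.
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    return up + down
```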



Polynucleotides and Nucleotide Sequences 

The present invention also relates to polynucleotides having a nucleotide sequence
which encodes for a polypeptide of the invention. In particular, the present invention relates to
polynucleotides consisting of a nucleotide sequence which encodes for a polypeptide of the 
invention. In a preferred embodiment, the nucleotide sequence is selected from the group 
consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ 
ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:65. In a more 
preferred embodiment, the nucleotide sequence is the mature polypeptide coding region 
contained in a plasmid which is contained in a deposited microorganism selected from the 
group consisting of CGMCC No. 0584, CGMCC No. 0581, CGMCC No. 0585, CGMCC No.
0582, CGMCC No. 0583, CGMCC No. 0580, CBS 109513, DSM 14348, DSM 15064, DSM
15065, DSM 15066, DSM 15067, CGMCC No. 0747, CGMCC No. 0748, CGMCC No. 0749, 
and CGMCC No. 0750. The present invention also encompasses polynucleotides comprising, 
preferably consisting of, nucleotide sequences which encode a polypeptide consisting of an 
amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, 
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID 
NO:60, and SEQ ID NO:66, which differ from a nucleotide sequence selected from the group 
consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:65 by virtue of the
degeneracy of the genetic code.

The present invention also relates to polynucleotides comprising, preferably consisting 
of, a subsequence of a nucleotide sequence selected from the group consisting of SEQ ID 
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID
NO:13, SEQ ID NO:15, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ
ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, 
SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:65 which encode fragments of an amino acid 
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, 
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ
ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, 
and SEQ ID NO:66 that have cellobiohydrolase I activity. A subsequence of a nucleotide 
sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, 
SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37,
SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID
NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and 
SEQ ID NO:65 is a nucleotide sequence encompassed by a sequence selected from the 
group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, 
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:37, SEQ ID NO:39, SEQ ID
NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ
ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:65 except that one 
or more nucleotides from the 5' and/or 3' end have been deleted. 
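
The definition of a subsequence given above (a listed sequence with one or more nucleotides deleted from the 5' and/or 3' end) can be sketched as follows. The function names and the 12-nucleotide example are hypothetical; whether the encoded fragment retains cellobiohydrolase I activity is an experimental question that code cannot answer.

```python
def truncate(parent: str, five_prime: int, three_prime: int) -> str:
    """Delete the given numbers of nucleotides from the 5' and 3' ends."""
    if five_prime < 0 or three_prime < 0 or five_prime + three_prime < 1:
        raise ValueError("at least one nucleotide must be deleted")
    if five_prime + three_prime >= len(parent):
        raise ValueError("nothing would remain of the parent sequence")
    return parent[five_prime:len(parent) - three_prime]


def is_terminal_truncation(candidate: str, parent: str) -> bool:
    """True if candidate is a contiguous, strictly shorter part of parent."""
    c, p = candidate.upper(), parent.upper()
    return 0 < len(c) < len(p) and c in p
```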

The present invention also relates to polynucleotides having, preferably consisting of, a 
modified nucleotide sequence which comprises at least one modification in the mature
polypeptide coding sequence selected from the group consisting of SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID 
NO:15, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ 
ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, 
SEQ ID NO:59, and SEQ ID NO:65, and where the modified nucleotide sequence encodes a
polypeptide which consists of an amino acid sequence selected from the group consisting of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, 
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID 
NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ 
ID NO:56, SEQ ID NO:58, SEQ ID NO:60, and SEQ ID NO:66. 

The techniques used to isolate or clone a nucleotide sequence encoding a polypeptide
are known in the art and include isolation from genomic DNA, preparation from cDNA, or a
combination thereof. The cloning of the nucleotide sequences of the present invention from
such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction
(PCR) or antibody screening of expression libraries to detect cloned DNA fragments with
shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to Methods and
Application, Academic Press, New York. Other amplification procedures such as ligase chain
reaction (LCR), ligated activated transcription (LAT) and nucleotide sequence-based
amplification (NASBA) may be used. The nucleotide sequence may be cloned from a strain
selected from the group consisting of Acremonium, Scytalidium, Thermoascus, Thielavia,
Verticillium, Neotermes, Melanocarpus, Poitrasia, Coprinus, Trichothecium, Humicola,
Cladorrhinum, Diplodia, Myceliophthora, Rhizomucor, Meripilus, Exidia, Xylaria, Trichophaea,
Chaetomium, Chaetomidium, Sporotrichum, Thielavia, Aspergillus, Scopulariopsis, Fusarium,
Pseudoplectania, and Phytophthora, or another or related organism and thus, for example,
may be an allelic or species variant of the polypeptide encoding region of the nucleotide
sequence.

The nucleotide sequence may be obtained by standard cloning procedures used in
genetic engineering to relocate the nucleotide sequence from its natural location to a different
site where it will be reproduced. The cloning procedures may involve excision and isolation of
a desired fragment comprising the nucleotide sequence encoding the polypeptide, insertion of
the fragment into a vector molecule, and incorporation of the recombinant vector into a host
cell where multiple copies or clones of the nucleotide sequence will be replicated. The
nucleotide sequence may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any
combinations thereof.

The present invention also relates to a polynucleotide comprising, preferably consisting
of, a nucleotide sequence which has a degree of identity with a nucleotide sequence selected
from the group consisting of
nucleotides 1 to 1578 of SEQ ID NO:1,
nucleotides 1 to 1587 of SEQ ID NO:3,
nucleotides 1 to 1353 of SEQ ID NO:5,
nucleotides 1 to 1371 of SEQ ID NO:7,
nucleotides 1 to 1614 of SEQ ID NO:9,
nucleotides 1 to 1245 of SEQ ID NO:11,
nucleotides 1 to 1341 of SEQ ID NO:13,
nucleotides 1 to 1356 of SEQ ID NO:15,
nucleotides 1 to 1365 of SEQ ID NO:37,
nucleotides 1 to 1377 of SEQ ID NO:39,
nucleotides 1 to 1353 of SEQ ID NO:41,
nucleotides 1 to 1341 of SEQ ID NO:43,
nucleotides 1 to 1584 of SEQ ID NO:45,
[...]
nucleotides 1 to 218 of SEQ ID NO:25,
nucleotides 1 to 492 of SEQ ID NO:26,
nucleotides 1 to 481 of SEQ ID NO:27,
nucleotides 1 to 463 of SEQ ID NO:28,
nucleotides 1 to 513 of SEQ ID NO:29,
nucleotides 1 to 579 of SEQ ID NO:30,
nucleotides 1 to 514 of SEQ ID NO:31,
nucleotides 1 to 477 of SEQ ID NO:32,
nucleotides 1 to 500 of SEQ ID NO:33,
nucleotides 1 to 470 of SEQ ID NO:34,
nucleotides 1 to 491 of SEQ ID NO:35,
nucleotides 1 to 221 of SEQ ID NO:36,
nucleotides 1 to 519 of SEQ ID NO:61,
nucleotides 1 to 497 of SEQ ID NO:62,
nucleotides 1 to 498 of SEQ ID NO:63,
nucleotides 1 to 525 of SEQ ID NO:64, and
nucleotides 1 to 951 of SEQ ID NO:67
of at least 70% identity, such as at least 75% identity; preferably, the nucleotide sequence has
at least 80% identity, e.g. at least 85% identity, such as at least 90% identity, more preferably
at least 95% identity, such as at least 96% identity, e.g. at least 97% identity, even more
preferably at least 98% identity, such as at least 99% identity. Preferably, the nucleotide
sequence encodes a polypeptide having cellobiohydrolase I activity. The degree of identity
between two nucleotide sequences is determined as described previously (see the section
entitled "Definitions").

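The graded identity thresholds above can be illustrated with a simple column-wise count over two sequences that have already been aligned. This is a sketch under stated assumptions only: the method that actually governs is the one set out in the "Definitions" section, which is not reproduced here, and the function names, the gap handling and the tier tuple are hypothetical conveniences.

```python
# Identity thresholds named in the text, in ascending order.
IDENTITY_TIERS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical nucleotides.

    Inputs must be pre-aligned and of equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(x == y and x != "-" for x, y in columns)
    return 100.0 * matches / len(columns)


def highest_tier_met(identity: float):
    """Return the highest listed threshold met, or None if below 70%."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None
```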
In another interesting aspect, the present invention relates to a polynucleotide having,
preferably consisting of, a nucleotide sequence which has at least 65% identity with the
cellobiohydrolase I encoding part of the nucleotide sequence inserted into a plasmid present in 
a deposited microorganism selected from the group consisting of CGMCC No. 0584, CGMCC 
No. 0581, CGMCC No. 0585, CGMCC No. 0582, CGMCC No. 0583, CGMCC No. 0580, CBS
109513, DSM 14348, DSM 15064, DSM 15065, DSM 15066, DSM 15067, CGMCC No. 0747,
CGMCC No. 0748, CGMCC No. 0749, and CGMCC No. 0750. In a preferred embodiment, the 
degree of identity with the cellobiohydrolase I encoding part of the nucleotide sequence 
inserted into a plasmid present in a deposited microorganism selected from the group 
consisting of CGMCC No. 0584, CGMCC No. 0581, CGMCC No. 0585, CGMCC No. 0582,
CGMCC No. 0583, CGMCC No. 0580, CBS 109513, DSM 14348, DSM 15064, DSM 15065,
DSM 15066, DSM 15067, CGMCC No. 0747, CGMCC No. 0748, CGMCC No. 0749, and
CGMCC No. 0750 is at least 70%, e.g. at least 80%, such as at least 90%, more preferably at
least 95%, such as at least 96%, e.g. at least 97%, even more preferably at least 98%, such
as at least 99%. Preferably, the nucleotide sequence comprises the cellobiohydrolase I 
encoding part of the nucleotide sequence inserted into a plasmid present in a deposited 
microorganism selected from the group consisting of CGMCC No. 0584, CGMCC No. 0581,
CGMCC No. 0585, CGMCC No. 0582, CGMCC No. 0583, CGMCC No. 0580, CBS 109513,
DSM 14348, DSM 15064, DSM 15065, DSM 15066, DSM 15067, CGMCC No. 0747, CGMCC 
No. 0748, CGMCC No. 0749, and CGMCC No. 0750. In an even more preferred embodiment, 
the nucleotide sequence consists of the cellobiohydrolase I encoding part of the nucleotide 
sequence inserted into a plasmid present in a deposited microorganism selected from the
group consisting of CGMCC No. 0584, CGMCC No. 0581, CGMCC No. 0585, CGMCC No.
0582, CGMCC No. 0583, CGMCC No. 0580, CBS 109513, DSM 14348, DSM 15064, DSM 
15065, DSM 15066, DSM 15067, CGMCC No. 0747, CGMCC No. 0748, CGMCC No. 0749, 
and CGMCC No. 0750. 

Modification of a nucleotide sequence encoding a polypeptide of the present invention
may be necessary for the synthesis of a polypeptide, which comprises an amino acid
sequence that has at least one substitution, deletion and/or insertion as compared to an amino 
acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID 
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID 
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ
ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
and SEQ ID NO:66. These artificial variants may differ in some engineered way from the 
polypeptide isolated from its native source, e.g., variants that differ in specific activity, 
thermostability, pH optimum, or the like. 

It will be apparent to those skilled in the art that such modifications can be made outside
the regions critical to the function of the molecule and still result in an active polypeptide.
Amino acid residues essential to the activity of the polypeptide encoded by the nucleotide
sequence of the invention, and therefore preferably not subject to modification, such as
substitution, may be identified according to procedures known in the art, such as site-directed
mutagenesis or alanine-scanning mutagenesis (see, e.g., Cunningham and Wells, 1989,
Science 244: 1081-1085). In the latter technique, mutations are introduced at every positively
charged residue in the molecule, and the resultant mutant molecules are tested for
cellobiohydrolase I activity to identify amino acid residues that are critical to the activity of the
molecule. Sites of substrate-enzyme interaction can also be determined by analysis of the
three-dimensional structure as determined by such techniques as nuclear magnetic resonance
analysis, crystallography or photoaffinity labelling (see, e.g., de Vos et al., 1992, Science 255:
306-312; Smith et al., 1992, Journal of Molecular Biology 224: 899-904; Wlodawer et al., 1992,
FEBS Letters 309: 59-64).
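
The enumeration step of the alanine scan described above can be sketched as follows. The sketch assumes that "positively charged" means lysine, arginine and histidine (histidine is only partly protonated near neutral pH, so its inclusion is a judgment call), and the function name and the five-residue example are hypothetical. Testing each variant for cellobiohydrolase I activity remains a laboratory step.

```python
# Assumption: K, R and H are treated as the positively charged residues.
POSITIVELY_CHARGED = set("KRH")


def alanine_scan_variants(protein: str) -> dict:
    """Map mutation labels (e.g. 'K2A') to single-alanine variant sequences."""
    p = protein.upper()
    variants = {}
    for position, residue in enumerate(p, start=1):
        if residue in POSITIVELY_CHARGED:
            label = f"{residue}{position}A"
            variants[label] = p[:position - 1] + "A" + p[position:]
    return variants
```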

Moreover, a nucleotide sequence encoding a polypeptide of the present invention may
be modified by introduction of nucleotide substitutions which do not give rise to another amino
acid sequence of the polypeptide encoded by the nucleotide sequence, but which correspond
to the codon usage of the host organism intended for production of the enzyme.
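
A silent recoding of this kind can be sketched with the standard genetic code. The preferred-codon table passed in is a hypothetical input: real host codon preferences would have to come from a codon usage table for the intended production organism, and residues without an entry simply keep their original codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Replace codons with host-preferred synonymous codons.

    'preferred' maps one-letter amino acid codes to a codon (hypothetical
    host preferences supplied by the caller).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError("preferred codon is not synonymous")
        recoded.append(new_codon)
    return "".join(recoded)
```

Comparing `translate(original)` with `translate(recode(original, preferred))` confirms that the encoded amino acid sequence is unchanged.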

The introduction of a mutation into the nucleotide sequence to exchange one nucleotide
for another nucleotide may be accomplished by site-directed mutagenesis using any of the
methods known in the art. Particularly useful is the procedure which utilizes a supercoiled,
double stranded DNA vector with an insert of interest and two synthetic primers containing the
desired mutation. The oligonucleotide primers, each complementary to opposite strands of
the vector, extend during temperature cycling by means of Pfu DNA polymerase. On
incorporation of the primers, a mutated plasmid containing staggered nicks is generated.
Following temperature cycling, the product is treated with DpnI, which is specific for methylated
and hemimethylated DNA, to digest the parental DNA template and to select for
mutation-containing synthesized DNA. Other procedures known in the art may also be used. For a
general description of nucleotide substitution, see, e.g., Ford et al., 1991, Protein Expression
and Purification 2: 95-107.
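
The primer pair used in such a procedure (two primers carrying the desired mutation, each complementary to opposite strands) can be sketched for a single-nucleotide exchange. The function names, the default flank length and the 16-nucleotide example template are hypothetical; practical primer design would also weigh melting temperature, GC content and primer length, none of which is modelled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mutagenic_primers(template: str, position: int, new_base: str, flank: int = 12):
    """Return (forward, reverse) primers centred on a single-base exchange.

    'position' is 1-based on the template strand; 'flank' is the number of
    unchanged nucleotides kept on each side of the mutation (an assumption).
    """
    t = template.upper()
    new_base = new_base.upper()
    index = position - 1
    if index - flank < 0 or index + flank >= len(t):
        raise ValueError("not enough flanking sequence around the position")
    if t[index] == new_base:
        raise ValueError("the new base is identical to the template base")
    forward = t[index - flank:index] + new_base + t[index + 1:index + flank + 1]
    return forward, reverse_complement(forward)
```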

The present invention also relates to a polynucleotide comprising, preferably consisting 
of, a nucleotide sequence which encodes a polypeptide having cellobiohydrolase I activity, and 
which hybridizes under very low stringency conditions, preferably under low stringency
conditions, more preferably under medium stringency conditions, more preferably under
medium-high stringency conditions, even more preferably under high stringency conditions, 
and most preferably under very high stringency conditions with a polynucleotide probe 
selected from the group consisting of 

(i) the complementary strand of the nucleotides selected from the group consisting of:
nucleotides 1 to 1578 of SEQ ID NO:1,
nucleotides 1 to 1302 of SEQ ID NO:1,
nucleotides 1 to 1587 of SEQ ID NO:3,
nucleotides 1 to 1302 of SEQ ID NO:3,
nucleotides 1 to 1353 of SEQ ID NO:5,
nucleotides 1 to 1302 of SEQ ID NO:5,
nucleotides 1 to 1371 of SEQ ID NO:7,
nucleotides 1 to 1302 of SEQ ID NO:7,
nucleotides 1 to 1614 of SEQ ID NO:9,
nucleotides 1 to 1302 of SEQ ID NO:9,
nucleotides 1 to 1245 of SEQ ID NO:11,
nucleotides 1 to 1341 of SEQ ID NO:13,
nucleotides 1 to 1302 of SEQ ID NO:13,
[...]
nucleotides 1 to 951 of SEQ ID NO:67; and

(iii) the complementary strand of the nucleotides selected from the group consisting of:
[...]
nucleotides 1 to 200 of SEQ ID NO:62,
nucleotides 1 to 200 of SEQ ID NO:63,
nucleotides 1 to 200 of SEQ ID NO:64, and
nucleotides 1 to 200 of SEQ ID NO:67.

As will be understood, details and particulars concerning hybridization of the nucleotide
sequences will be the same or analogous to the hybridization aspects discussed in the section
entitled "Polypeptides Having Cellobiohydrolase I Activity" herein.



Nucleic Acid Constructs 

The present invention also relates to nucleic acid constructs comprising a nucleotide
sequence of the present invention operably linked to one or more control sequences that
direct the expression of the coding sequence in a suitable host cell under conditions 
compatible with the control sequences. 

A nucleotide sequence encoding a polypeptide of the present invention may be
manipulated in a variety of ways to provide for expression of the polypeptide. Manipulation of
the nucleotide sequence prior to its insertion into a vector may be desirable or necessary 
depending on the expression vector. The techniques for modifying nucleotide sequences 
utilizing recombinant DNA methods are well known in the art. 

The control sequence may be an appropriate promoter sequence, a nucleotide
sequence which is recognized by a host cell for expression of the nucleotide sequence. The
promoter sequence contains transcriptional control sequences, which mediate the expression 
of the polypeptide. The promoter may be any nucleotide sequence which shows 
transcriptional activity in the host cell of choice including mutant, truncated, and hybrid 
promoters, and may be obtained from genes encoding extracellular or intracellular
polypeptides either homologous or heterologous to the host cell.

Examples of suitable promoters for directing the transcription of the nucleic acid
constructs of the present invention, especially in a bacterial host cell, are the promoters
obtained from the E. coli lac operon, Streptomyces coelicolor agarase gene (dagA), Bacillus
subtilis levansucrase gene (sacB), Bacillus licheniformis alpha-amylase gene (amyL), Bacillus
stearothermophilus maltogenic amylase gene (amyM), Bacillus amyloliquefaciens
alpha-amylase gene (amyQ), Bacillus licheniformis penicillinase gene (penP), Bacillus subtilis xylA
and xylB genes, and prokaryotic beta-lactamase gene (Villa-Komaroff et al., 1978,
Proceedings of the National Academy of Sciences USA 75: 3727-3731), as well as the tac
promoter (DeBoer et al., 1983, Proceedings of the National Academy of Sciences USA 80:
21-25). Further promoters are described in "Useful proteins from recombinant bacteria" in
Scientific American, 1980, 242: 74-94; and in Sambrook et al., 1989, supra.

Examples of suitable promoters for directing the transcription of the nucleic acid
constructs of the present invention in a filamentous fungal host cell are promoters obtained
from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, 
Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, 
Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase,
Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase,
Aspergillus nidulans acetamidase, and Fusarium oxysporum trypsin-like protease (WO 
96/00787), as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for 
Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), 
and mutant, truncated, and hybrid promoters thereof. 

In a yeast host, useful promoters are obtained from the genes for Saccharomyces
cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1),
Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate
dehydrogenase (ADH2/GAP), and Saccharomyces cerevisiae 3-phosphoglycerate kinase.
Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8:
423-488.

The control sequence may also be a suitable transcription terminator sequence, a 
sequence recognized by a host cell to terminate transcription. The terminator sequence is 
operably linked to the 3' terminus of the nucleotide sequence encoding the polypeptide. Any 
terminator which is functional in the host cell of choice may be used in the present invention.

Preferred terminators for filamentous fungal host cells are obtained from the genes for
Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans
anthranilate synthase, Aspergillus niger alpha-glucosidase, and Fusarium oxysporum trypsin- 
like protease. 

Preferred terminators for yeast host cells are obtained from the genes for 
Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), and
Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful
terminators for yeast host cells are described by Romanos et al., 1992, supra.

The control sequence may also be a suitable leader sequence, a nontranslated region of 
an mRNA which is important for translation by the host cell. The leader sequence is operably 
linked to the 5' terminus of the nucleotide sequence encoding the polypeptide. Any leader
sequence that is functional in the host cell of choice may be used in the present invention. 

Preferred leaders for filamentous fungal host cells are obtained from the genes for 
Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase. 

Suitable leaders for yeast host cells are obtained from the genes for Saccharomyces 
cerevisiae enolase (ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase,
Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol 
dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).

The control sequence may also be a polyadenylation sequence, a sequence operably
linked to the 3' terminus of the nucleotide sequence and which, when transcribed, is
recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA.
Any polyadenylation sequence which is functional in the host cell of choice may be used in the
present invention.

Preferred polyadenylation sequences for filamentous fungal host cells are obtained from 
the genes for Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus 
nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease, and Aspergillus 
niger alpha-glucosidase. 

Useful polyadenylation sequences for yeast host cells are described by Guo and
Sherman, 1995, Molecular and Cellular Biology 15: 5983-5990.

The control sequence may also be a signal peptide coding region that codes for an 
amino acid sequence linked to the amino terminus of a polypeptide and directs the encoded 
polypeptide into the cell's secretory pathway. The 5' end of the coding sequence of the
nucleotide sequence may inherently contain a signal peptide coding region naturally linked in
translation reading frame with the segment of the coding region which encodes the secreted
polypeptide. Alternatively, the 5' end of the coding sequence may contain a signal peptide
coding region which is foreign to the coding sequence. The foreign signal peptide coding
region may be required where the coding sequence does not naturally contain a signal peptide
coding region. Alternatively, the foreign signal peptide coding region may simply replace the
natural signal peptide coding region in order to enhance secretion of the polypeptide. 
However, any signal peptide coding region which directs the expressed polypeptide into the 
secretory pathway of a host cell of choice may be used in the present invention. 

Effective signal peptide coding regions for bacterial host cells are the signal peptide
coding regions obtained from the genes for Bacillus NCIB 11837 maltogenic amylase, Bacillus
stearothermophilus alpha-amylase, Bacillus licheniformis subtilisin, Bacillus licheniformis
beta-lactamase, Bacillus stearothermophilus neutral proteases (nprT, nprS, nprM), and Bacillus
subtilis prsA. Further signal peptides are described by Simonen and Palva, 1993, 
Microbiological Reviews 57: 109-137. 

Effective signal peptide coding regions for filamentous fungal host cells are the signal
peptide coding regions obtained from the genes for Aspergillus oryzae TAKA amylase,
Aspergillus niger neutral amylase, Aspergillus niger glucoamylase, Rhizomucor miehei aspartic 
proteinase, Humicola insolens cellulase, and Humicola lanuginosa lipase. 

Useful signal peptides for yeast host cells are obtained from the genes for
Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase. Other
useful signal peptide coding regions are described by Romanos et al., 1992, supra.

The control sequence may also be a propeptide coding region that codes for an amino
acid sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is
known as a proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is
generally inactive and can be converted to a mature active polypeptide by catalytic or
autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding
region may be obtained from the genes for Bacillus subtilis alkaline protease (aprE), Bacillus
subtilis neutral protease (nprT), Saccharomyces cerevisiae alpha-factor, Rhizomucor miehei
aspartic proteinase, and Myceliophthora thermophila laccase (WO 95/33836).

Where both signal peptide and propeptide regions are present at the amino terminus of 
a polypeptide, the propeptide region is positioned next to the amino terminus of a polypeptide 
and the signal peptide region is positioned next to the amino terminus of the propeptide
region. 
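
The ordering of the two regions can be illustrated with a minimal Python sketch. The function name `assemble_precursor` and the placeholder strings are hypothetical and serve only to show the arrangement from amino terminus to carboxy terminus; they are not sequences of the invention.

```python
def assemble_precursor(mature, propeptide="", signal_peptide=""):
    """Return the precursor amino acid sequence, amino terminus first.

    The propeptide (if present) is placed directly at the amino terminus
    of the mature polypeptide, and the signal peptide (if present) is
    placed directly at the amino terminus of the propeptide.
    """
    return signal_peptide + propeptide + mature


# Placeholder strings only (hypothetical, not real sequences).
precursor = assemble_precursor(mature="MATURE", propeptide="PRO", signal_peptide="SIG")
print(precursor)
```

Either optional region may be omitted, in which case the remaining regions keep the same relative order.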

It may also be desirable to add regulatory sequences which allow the regulation of the 
expression of the polypeptide relative to the growth of the host cell. Examples of regulatory 
systems are those which cause the expression of the gene to be turned on or off in response
to a chemical or physical stimulus, including the presence of a regulatory compound.
Regulatory systems in prokaryotic systems include the lac, tac, and trp operator systems. In
yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the TAKA alpha-amylase
promoter, Aspergillus niger glucoamylase promoter, and Aspergillus oryzae
glucoamylase promoter may be used as regulatory sequences. Other examples of regulatory
sequences are those which allow for gene amplification. In eukaryotic systems, these include
the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the 
metallothionein genes which are amplified with heavy metals. In these cases, the nucleotide 
sequence encoding the polypeptide would be operably linked with the regulatory sequence. 



Expression Vectors

The present invention also relates to recombinant expression vectors comprising the 
nucleic acid construct of the invention. The various nucleotide and control sequences 
described above may be joined together to produce a recombinant expression vector which 
may include one or more convenient restriction sites to allow for insertion or substitution of the
nucleotide sequence encoding the polypeptide at such sites. Alternatively, the nucleotide
sequence of the present invention may be expressed by inserting the nucleotide sequence or 
a nucleic acid construct comprising the sequence into an appropriate vector for expression. In 
creating the expression vector, the coding sequence is located in the vector so that the coding 
sequence is operably linked with the appropriate control sequences for expression. 

The recombinant expression vector may be any vector (e.g., a plasmid or virus) which
can be conveniently subjected to recombinant DNA procedures and can bring about the
expression of the nucleotide sequence. The choice of the vector will typically depend on the
compatibility of the vector with the host cell into which the vector is to be introduced. The
vectors may be linear or closed circular plasmids.

The vector may be an autonomously replicating vector, i.e., a vector which exists as an 
extrachromosomal entity, the replication of which is independent of chromosomal replication, 
e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial
chromosome. 

The vector may contain any means for assuring self-replication. Alternatively, the vector 
may be one which, when introduced into the host cell, is integrated into the genome and 
replicated together with the chromosome(s) into which it has been integrated. Furthermore, a
single vector or plasmid or two or more vectors or plasmids which together contain the total
DNA to be introduced into the genome of the host cell, or a transposon, may be used.

The vectors of the present invention preferably contain one or more selectable markers 
which permit easy selection of transformed cells. A selectable marker is a gene the product of 
which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to
auxotrophs, and the like.

Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or 
Bacillus licheniformis, or markers which confer antibiotic resistance such as ampicillin, 
kanamycin, chloramphenicol or tetracycline resistance. Suitable markers for yeast host cells 
are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3. Selectable markers for use in a
filamentous fungal host cell include, but are not limited to, amdS (acetamidase), argB
(ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin
phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase),
sC (sulfate adenyltransferase), trpC (anthranilate synthase), as well as equivalents thereof.

Preferred for use in an Aspergillus cell are the amdS and pyrG genes of Aspergillus
nidulans or Aspergillus oryzae and the bar gene of Streptomyces hygroscopicus.

The vectors of the present invention preferably contain an element(s) that permits stable 
integration of the vector into the host cell's genome or autonomous replication of the vector in 
the cell independent of the genome. 

For integration into the host cell genome, the vector may rely on the nucleotide
sequence encoding the polypeptide or any other element of the vector for stable integration of
the vector into the genome by homologous or nonhomologous recombination. Alternatively,
the vector may contain additional nucleotide sequences for directing integration by
homologous recombination into the genome of the host cell. The additional nucleotide
sequences enable the vector to be integrated into the host cell genome at a precise location(s)
in the chromosome(s). To increase the likelihood of integration at a precise location, the
integrational elements should preferably contain a sufficient number of nucleotides, such as
100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500
base pairs, which are highly homologous with the corresponding target sequence to enhance
the probability of homologous recombination. The integrational elements may be any
sequence that is homologous with the target sequence in the genome of the host cell.
Furthermore, the integrational elements may be non-encoding or encoding nucleotide
sequences. On the other hand, the vector may be integrated into the genome of the host cell
by non-homologous recombination.
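
The nested length ranges given above for integrational elements can be summarized in a short Python sketch. The function name and the tier labels are hypothetical and chosen only for illustration; the numeric thresholds (100, 400, 800, and 1,500 base pairs) are those stated in the text.

```python
def integration_element_preference(length_bp):
    """Classify an integrational element by its length in base pairs.

    Thresholds follow the stated ranges: 100 to 1,500 bp (suitable),
    400 to 1,500 bp (preferred), and 800 to 1,500 bp (most preferred).
    The returned labels are illustrative assumptions.
    """
    if not 100 <= length_bp <= 1500:
        return "outside stated range"
    if length_bp >= 800:
        return "most preferred"
    if length_bp >= 400:
        return "preferred"
    return "suitable"


for length in (50, 250, 400, 800, 1500, 2000):
    print(length, integration_element_preference(length))
```

Length is only one criterion; the element must also be highly homologous with the corresponding target sequence, which this sketch does not model.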

For autonomous replication, the vector may further comprise an origin of replication 
enabling the vector to replicate autonomously in the host cell in question. Examples of 
bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19,
pACYC177, and pACYC184 permitting replication in E. coli, and pUB110, pE194, pTA1060,
and pAMβ1 permitting replication in Bacillus. Examples of origins of replication for use in a
yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1
and CEN3, and the combination of ARS4 and CEN6. The origin of replication may be one
having a mutation which makes its functioning temperature-sensitive in the host cell (see, e.g.,
Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75: 1433).

More than one copy of a nucleotide sequence of the present invention may be inserted 
into the host cell to increase production of the gene product. An increase in the copy number 
of the nucleotide sequence can be obtained by integrating at least one additional copy of the 
sequence into the host cell genome or by including an amplifiable selectable marker gene with
the nucleotide sequence where cells containing amplified copies of the selectable marker
gene, and thereby additional copies of the nucleotide sequence, can be selected for by 
cultivating the cells in the presence of the appropriate selectable agent. 

The procedures used to ligate the elements described above to construct the 
recombinant expression vectors of the present invention are well known to one skilled in the
art (see, e.g., Sambrook et al., 1989, supra).



Host Cells 

The present invention also relates to a recombinant host cell comprising the nucleic acid
construct of the invention, which is advantageously used in the recombinant production of
the polypeptides. A vector comprising a nucleotide sequence of the present invention is
introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a 
self-replicating extra-chromosomal vector as described earlier. 

The host cell may be a unicellular microorganism, e.g., a prokaryote, or a non-unicellular 
microorganism, e.g., a eukaryote. 

Useful unicellular cells are bacterial cells such as gram positive bacteria including, but
not limited to, a Bacillus cell, e.g., Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus
brevis, Bacillus circulans, Bacillus clausii, Bacillus coagulans, Bacillus lautus, Bacillus lentus,
Bacillus licheniformis, Bacillus megaterium, Bacillus stearothermophilus, Bacillus subtilis, and
Bacillus thuringiensis; or a Streptomyces cell, e.g., Streptomyces lividans or Streptomyces
murinus, or gram negative bacteria such as E. coli and Pseudomonas sp. In a preferred
embodiment, the bacterial host cell is a Bacillus lentus, Bacillus licheniformis, Bacillus
stearothermophilus, or Bacillus subtilis cell. In another preferred embodiment, the Bacillus cell
is an alkalophilic Bacillus.

The introduction of a vector into a bacterial host cell may, for instance, be effected by 
protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics
168: 111-115), using competent cells (see, e.g., Young and Spizizen, 1961, Journal of
Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular
Biology 56: 209-221), electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques
6: 742-751), or conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169:
5271-5278).

The host cell may be a eukaryote, such as a mammalian, insect, plant, or fungal cell. 

In a preferred embodiment, the host cell is a fungal cell. "Fungi" as used herein includes
the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by
Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB
International, University Press, Cambridge, UK) as well as the Oomycota (as cited in
Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995,
supra).

In a more preferred embodiment, the fungal host cell is a yeast cell. "Yeast" as used 
herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and 
yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may 
change in the future, for the purposes of this invention, yeast shall be defined as described in 
Biology and Activities of Yeast (Skinner, F.A., Passmore, S.M., and Davenport, R.R., eds,
Soc. App. Bacteriol. Symposium Series No. 9, 1980). 

In an even more preferred embodiment, the yeast host cell is a Candida, Aschbyii, 
Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or Yarrowia cell. 

In a most preferred embodiment, the yeast host cell is a Saccharomyces carlsbergensis, 
Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii,
Saccharomyces kluyveri, Saccharomyces norbensis or Saccharomyces oviformis cell. In 
another most preferred embodiment, the yeast host cell is a Kluyveromyces lactis cell. In 
another most preferred embodiment, the yeast host cell is a Yarrowia lipolytica cell. 

In another more preferred embodiment, the fungal host cell is a filamentous fungal cell.
"Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota
(as defined by Hawksworth et al., 1995, supra). The filamentous fungi are characterized by a
mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex
polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is
obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces
cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative.

In an even more preferred embodiment, the filamentous fungal host cell is a cell of a
species of, but not limited to, Acremonium, Aspergillus, Fusarium, Humicola, Mucor,
Myceliophthora, Neurospora, Penicillium, Thielavia, Tolypocladium, or Trichoderma. 

In a most preferred embodiment, the filamentous fungal host cell is an Aspergillus 
awamori, Aspergillus foetidus, Aspergillus japonicus, Aspergillus nidulans, Aspergillus niger or 
Aspergillus oryzae cell. In another most preferred embodiment, the filamentous fungal host
cell is a Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium
culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium
negundi, Fusarium oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium
sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides, Fusarium sulphureum,
Fusarium torulosum, Fusarium trichothecioides, or Fusarium venenatum cell. In an even most
preferred embodiment, the filamentous fungal parent cell is a Fusarium venenatum (Nirenberg
sp. nov.) cell. In another most preferred embodiment, the filamentous fungal host cell is a
Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila,
Neurospora crassa, Penicillium purpurogenum, Thielavia terrestris, Trichoderma harzianum,
Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, or Trichoderma
viride cell.

Fungal cells may be transformed by a process involving protoplast formation, 
transformation of the protoplasts, and regeneration of the cell wall in a manner known per se. 
Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023 
and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81: 1470-1474.
Suitable methods for transforming Fusarium species are described by Malardier et al.,
1989, Gene 78: 147-156 and WO 96/00787. Yeast may be transformed using the procedures
described by Becker and Guarente, In Abelson, J.N. and Simon, M.I., editors, Guide to Yeast
Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic
Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology 153: 163; and Hinnen et al.,
1978, Proceedings of the National Academy of Sciences USA 75: 1920.

Methods of Production 

The present invention also relates to methods for producing a polypeptide of the present 
invention comprising (a) cultivating a strain, which in its wild-type form is capable of producing 
the polypeptide; and (b) recovering the polypeptide. Preferably, the strain is selected from the
group consisting of Acremonium, Scytalidium, Thermoascus, Thielavia, Verticillium, 
Neotermes, Melanocarpus, Poitrasia, Coprinus, Trichothecium, Humicola, Cladorrhinum,
Diplodia, Myceliophthora, Rhizomucor, Meripilus, Exidia, Xylaria, Trichophaea, Chaetomium,
Chaetomidium, Sporotrichum, Thielavia, Aspergillus, Scopulariopsis, Fusarium, 
Pseudoplectania, and Phytophthora; more preferably the strain is selected from the group 
consisting of Acremonium thermophilum, Chaetomium thermophilum, Scytalidium 
thermophilum, Thermoascus aurantiacus, Thielavia australiensis, Verticillium tenerum,
Neotermes castaneus, Melanocarpus albomyces, Poitrasia circinans, Coprinus cinereus,
Trichothecium roseum, Humicola nigrescens, Cladorrhinum foecundissimum, Diplodia
gossypina, Myceliophthora thermophila, Rhizomucor pusillus, Meripilus giganteus, Exidia
glandulosa, Xylaria hypoxylon, Trichophaea saccata, Chaetomidium pingtungium,
Myceliophthora thermophila, Myceliophthora hinnulea, Sporotrichum pruinosum, Thielavia cf.
microspora, Pseudoplectania nigrella, and Phytophthora infestans. 

The present invention also relates to methods for producing a polypeptide of the present 
invention comprising (a) cultivating a host cell under conditions conducive for production of the 
polypeptide; and (b) recovering the polypeptide. 

The present invention also relates to methods for in-situ production of a polypeptide of
the present invention comprising (a) cultivating a host cell under conditions conducive for
production of the polypeptide; and (b) contacting the polypeptide with a desired substrate,
such as a cellulosic substrate, without prior recovery of the polypeptide. The term "in-situ
production" is intended to mean that the polypeptide is produced directly in the locus in which
it is intended to be used, such as in a fermentation process for production of ethanol.

In the production methods of the present invention, the cells are cultivated in a nutrient 
medium suitable for production of the polypeptide using methods known in the art. For 
example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale 
fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory
or industrial fermentors performed in a suitable medium and under conditions allowing the
polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient
medium comprising carbon and nitrogen sources and inorganic salts, using procedures known
in the art. Suitable media are available from commercial suppliers or may be prepared
according to published compositions (e.g., in catalogues of the American Type Culture
Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be
recovered directly from the medium. If the polypeptide is not secreted, it can be recovered 
from cell lysates. 

The polypeptides may be detected using methods known in the art that are specific for 
the polypeptides. These detection methods may include use of specific antibodies, formation 
of an enzyme product, or disappearance of an enzyme substrate. For example, an enzyme
assay may be used to determine the activity of the polypeptide as described herein. 

The resulting polypeptide may be recovered by methods known in the art. For example,
the polypeptide may be recovered from the nutrient medium by conventional procedures
including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or 
precipitation. 

The polypeptides of the present invention may be purified by a variety of procedures 
known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity,
hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., 
preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), 
SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, 
editors, VCH Publishers, New York, 1989). 


Plants 

The present invention also relates to a transgenic plant, plant part, or plant cell which 
has been transformed with a nucleotide sequence encoding a polypeptide having 
cellobiohydrolase I activity of the present invention so as to express and produce the 
polypeptide in recoverable quantities. The polypeptide may be recovered from the plant or
plant part. Alternatively, the plant or plant part containing the recombinant polypeptide may be 
used as such for improving the quality of a food or feed, e.g., improving nutritional value, 
palatability, and rheological properties, or to destroy an antinutritive factor. 

The transgenic plant can be dicotyledonous (a dicot) or monocotyledonous (a monocot). 
Examples of monocot plants are grasses, such as meadow grass (blue grass, Poa), forage
grass such as Festuca and Lolium, temperate grass such as Agrostis, and cereals, e.g., wheat,
oats, rye, barley, rice, sorghum, millets, and maize (corn). 

Examples of dicot plants are tobacco, lupins, potato, sugar beet, legumes, such as pea, 
bean and soybean, and cruciferous plants (family Brassicaceae), such as cauliflower, rape, 
canola, and the closely related model organism Arabidopsis thaliana.

Examples of plant parts are stem, callus, leaves, root, fruits, seeds, and tubers. Also 
specific plant tissues, such as chloroplast, apoplast, mitochondria, vacuole, peroxisomes, and 
cytoplasm are considered to be a plant part. Furthermore, any plant cell, whatever the tissue 
origin, is considered to be a plant part.

Also included within the scope of the present invention are the progeny (clonal or seed)
of such plants, plant parts and plant cells.

The transgenic plant or plant cell expressing a polypeptide of the present invention may 
be constructed in accordance with methods known in the art. Briefly, the plant or plant cell is 
constructed by incorporating one or more expression constructs encoding a polypeptide of the 
present invention into the plant host genome and propagating the resulting modified plant or
plant cell into a transgenic plant or plant cell. 

Conveniently, the expression construct is a nucleic acid construct which comprises a
nucleotide sequence encoding a polypeptide of the present invention operably linked with
appropriate regulatory sequences required for expression of the nucleotide sequence in the
plant or plant part of choice. Furthermore, the expression construct may comprise a
selectable marker useful for identifying host cells into which the expression construct has been
integrated and DNA sequences necessary for introduction of the construct into the plant in
question (the latter depends on the DNA introduction method to be used). 

The choice of regulatory sequences, such as promoter and terminator sequences and 
optionally signal or transit sequences is determined, for example, on the basis of when, where, 
and how the polypeptide is desired to be expressed. For instance, the expression of the gene
encoding a polypeptide of the present invention may be constitutive or inducible, or may be
developmental, stage or tissue specific, and the gene product may be targeted to a specific
tissue or plant part such as seeds or leaves. Regulatory sequences are, for example,
described by Tague et al., 1988, Plant Physiology 86: 506.

For constitutive expression, the 35S-CaMV promoter may be used (Franck et al., 1980,
Cell 21: 285-294). Organ-specific promoters may be, for example, a promoter from storage
sink tissues such as seeds, potato tubers, and fruits (Edwards & Coruzzi, 1990, Ann. Rev.
Genet. 24: 275-303), or from metabolic sink tissues such as meristems (Ito et al., 1994, Plant
Mol. Biol. 24: 863-878), a seed specific promoter such as the glutelin, prolamin, globulin, or
albumin promoter from rice (Wu et al., 1998, Plant and Cell Physiology 39: 885-889), a Vicia
faba promoter from the legumin B4 and the unknown seed protein gene from Vicia faba
(Conrad et al., 1998, Journal of Plant Physiology 152: 708-711), a promoter from a seed oil
body protein (Chen et al., 1998, Plant and Cell Physiology 39: 935-941), the storage protein
napA promoter from Brassica napus, or any other seed specific promoter known in the art,
e.g., as described in WO 91/14772. Furthermore, the promoter may be a leaf specific
promoter such as the rbcs promoter from rice or tomato (Kyozuka et al., 1993, Plant
Physiology 102: 991-1000), the chlorella virus adenine methyltransferase gene promoter (Mitra
and Higgins, 1994, Plant Molecular Biology 26: 85-93), or the aldP gene promoter from rice
(Kagaya et al., 1995, Molecular and General Genetics 248: 668-674), or a wound inducible
promoter such as the potato pin2 promoter (Xu et al., 1993, Plant Molecular Biology 22: 573-588).

A promoter enhancer element may also be used to achieve higher expression of the 
enzyme in the plant. For instance, the promoter enhancer element may be an intron which is 
placed between the promoter and the nucleotide sequence encoding a polypeptide of the 
present invention. For instance, Xu et al., 1993, supra disclose the use of the first intron of the
rice actin 1 gene to enhance expression.

The selectable marker gene and any other parts of the expression construct may be 
chosen from those available in the art.

The nucleic acid construct is incorporated into the plant genome according to
conventional techniques known in the art, including Agrobacterium-mediated transformation,
virus-mediated transformation, microinjection, particle bombardment, biolistic transformation,
and electroporation (Gasser et al., 1990, Science 244: 1293; Potrykus, 1990, Bio/Technology
8: 535; Shimamoto et al., 1989, Nature 338: 274).

Presently, Agrobacterium tumefaciens-mediated gene transfer is the method of choice
for generating transgenic dicots (for a review, see Hooykaas and Schilperoort, 1992, Plant
Molecular Biology 19: 15-38). However, it can also be used for transforming monocots,
although other transformation methods are generally preferred for these plants. Presently, the
method of choice for generating transgenic monocots is particle bombardment (microscopic
gold or tungsten particles coated with the transforming DNA) of embryonic calli or developing
embryos (Christou, 1992, Plant Journal 2: 275-281; Shimamoto, 1994, Current Opinion
Biotechnology 5: 158-162; Vasil et al., 1992, Bio/Technology 10: 667-674). An alternative
method for transformation of monocots is based on protoplast transformation as described by
Omirulleh et al., 1993, Plant Molecular Biology 21: 415-428.

Following transformation, the transformants having incorporated therein the expression 
construct are selected and regenerated into whole plants according to methods well-known in 
the art. 

The present invention also relates to methods for producing a polypeptide of the present
invention comprising (a) cultivating a transgenic plant or a plant cell comprising a nucleotide
sequence encoding a polypeptide having cellobiohydrolase I activity of the present invention 
under conditions conducive for production of the polypeptide; and (b) recovering the 
polypeptide. 

The present invention also relates to methods for in-situ production of a polypeptide of
the present invention comprising (a) cultivating a transgenic plant or a plant cell comprising a
nucleotide sequence encoding a polypeptide having cellobiohydrolase I activity of the present 
invention under conditions conducive for production of the polypeptide; and (b) contacting the 
polypeptide with a desired substrate, such as a cellulosic substrate, without prior recovery of 
the polypeptide. 

Compositions 

In a still further aspect, the present invention relates to compositions comprising a 
polypeptide of the present invention. 

The composition may comprise a polypeptide of the invention as the major enzymatic
component, e.g., a mono-component composition. Alternatively, the composition may
comprise multiple enzymatic activities, such as an aminopeptidase, amylase, carbohydrase,
carboxypeptidase, catalase, cellulase, chitinase, cutinase, cyclodextrin glycosyltransferase,
deoxyribonuclease, esterase, alpha-galactosidase, beta-galactosidase, glucoamylase,
alpha-glucosidase, beta-glucosidase, haloperoxidase, invertase, laccase, lipase, mannosidase,
oxidase, pectinolytic enzyme, peptidoglutaminase, peroxidase, phytase, polyphenoloxidase,
proteolytic enzyme, ribonuclease, transglutaminase, or xylanase.

The compositions may be prepared in accordance with methods known in the art and
may be in the form of a liquid or a dry composition. For instance, the polypeptide composition
may be in the form of a granulate or a microgranulate. The polypeptide to be included in the
composition may be stabilized in accordance with methods known in the art.

Examples are given below of preferred uses of the polypeptide compositions of the
invention. The dosage of the polypeptide composition of the invention and other conditions
under which the composition is used may be determined on the basis of methods known in the 
art. 



Detergent Compositions 

The polypeptide of the invention may be added to and thus become a component of a
detergent composition.

The detergent composition of the invention may for example be formulated as a hand or
machine laundry detergent composition including a laundry additive composition suitable for
pre-treatment of stained fabrics and a rinse-added fabric softener composition, or be
formulated as a detergent composition for use in general household hard surface cleaning
operations, or be formulated for hand or machine dishwashing operations.

In a specific aspect, the invention provides a detergent additive comprising the
polypeptide of the invention. The detergent additive as well as the detergent composition may
comprise one or more other enzymes such as a protease, a lipase, a cutinase, an amylase, a
carbohydrase, a cellulase, a pectinase, a mannanase, an arabinase, a galactanase, a
xylanase, an oxidase, e.g., a laccase, and/or a peroxidase.

In general, the properties of the chosen enzyme(s) should be compatible with the
selected detergent (i.e., pH-optimum, compatibility with other enzymatic and non-enzymatic
ingredients, etc.), and the enzyme(s) should be present in effective amounts.

Proteases: Suitable proteases include those of animal, vegetable or microbial origin. Microbial
origin is preferred. Chemically modified or protein engineered mutants are included. The
protease may be a serine protease or a metallo protease, preferably an alkaline microbial
protease or a trypsin-like protease. Examples of alkaline proteases are subtilisins, especially
those derived from Bacillus, e.g., subtilisin Novo, subtilisin Carlsberg, subtilisin 309, subtilisin
147 and subtilisin 168 (described in WO 89/06279). Examples of trypsin-like proteases are
trypsin (e.g. of porcine or bovine origin) and the Fusarium protease described in WO 89/06270
and WO 94/25583.

Examples of useful proteases are the variants described in WO 92/19729, WO
98/20115, WO 98/20116, and WO 98/34946, especially the variants with substitutions in one
or more of the following positions: 27, 36, 57, 76, 87, 97, 101, 104, 120, 123, 167, 170, 194,
206, 218, 222, 224, 235 and 274.
Lipases: Suitable lipases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Examples of useful lipases include lipases from
Humicola (synonym Thermomyces), e.g. from H. lanuginosa (T. lanuginosus) as described in
EP 258 068 and EP 305 216 or from H. insolens as described in WO 96/13580, a
Pseudomonas lipase, e.g. from P. alcaligenes or P. pseudoalcaligenes (EP 218 272), P.
cepacia (EP 331 376), P. stutzeri (GB 1,372,034), P. fluorescens, Pseudomonas sp. strain SD
705 (WO 95/06720 and WO 96/27002), P. wisconsinensis (WO 96/12012), a Bacillus lipase,
e.g. from B. subtilis (Dartois et al. (1993), Biochimica et Biophysica Acta, 1131, 253-360), B.
stearothermophilus (JP 64/744992) or B. pumilus (WO 91/16422).

Other examples are lipase variants such as those described in WO 92/05249, WO
94/01541, EP 407 225, EP 260 105, WO 95/35381, WO 96/00292, WO 95/30744, WO
94/25578, WO 95/14783, WO 95/22615, WO 97/04079 and WO 97/07202.

Amylases: Suitable amylases (alpha and/or beta) include those of bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Amylases include, for
example, alpha-amylases obtained from Bacillus, e.g. a special strain of B. licheniformis,
described in more detail in GB 1,296,839.

Examples of useful amylases are the variants described in WO 94/02597, WO 
94/18314, WO 96/23873, and WO 97/43424, especially the variants with substitutions in one 
or more of the following positions: 15, 23, 105, 106, 124, 128, 133, 154, 156, 181, 188, 190, 
197, 202, 208, 209, 243, 264, 304, 305, 391, 408, and 444. 

Cellulases: Suitable cellulases include those of bacterial or fungal origin. Chemically modified
or protein engineered mutants are included. Suitable cellulases include cellulases from the
genera Bacillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, e.g. the fungal
cellulases produced from Humicola insolens, Myceliophthora thermophila and Fusarium
oxysporum disclosed in US 4,435,307, US 5,648,263, US 5,691,178, US 5,776,757 and WO
89/09259.

Especially suitable cellulases are the alkaline or neutral cellulases having colour care
benefits. Examples of such cellulases are cellulases described in EP 0 495 257, EP 0 531
372, WO 96/11262, WO 96/29397, WO 98/08940. Other examples are cellulase variants such
as those described in WO 94/07998, EP 0 531 315, US 5,457,046, US 5,686,593, US
5,763,254, WO 95/24471, WO 98/12307 and PCT/DK98/00299.

Peroxidases/Oxidases: Suitable peroxidases/oxidases include those of plant, bacterial or
fungal origin. Chemically modified or protein engineered mutants are included. Examples of
useful peroxidases include peroxidases from Coprinus, e.g. from C. cinereus, and variants
thereof as those described in WO 93/24618, WO 95/10602, and WO 98/15257.

The detergent enzyme(s) may be included in a detergent composition by adding
separate additives containing one or more enzymes, or by adding a combined additive
comprising all of these enzymes. A detergent additive of the invention, i.e. a separate additive
or a combined additive, can be formulated e.g. as a granulate, a liquid, a slurry, etc. Preferred
detergent additive formulations are granulates, in particular non-dusting granulates, liquids, in
particular stabilized liquids, or slurries.

Non-dusting granulates may be produced, e.g., as disclosed in US 4,106,991 and
4,661,452 and may optionally be coated by methods known in the art. Examples of waxy
coating materials are poly(ethylene oxide) products (polyethyleneglycol, PEG) with mean
molar weights of 1000 to 20000; ethoxylated nonylphenols having from 16 to 50 ethylene
oxide units; ethoxylated fatty alcohols in which the alcohol contains from 12 to 20 carbon
atoms and in which there are 15 to 80 ethylene oxide units; fatty alcohols; fatty acids; and
mono- and di- and triglycerides of fatty acids. Examples of film-forming coating materials
suitable for application by fluid bed techniques are given in GB 1483591. Liquid enzyme
preparations may, for instance, be stabilized by adding a polyol such as propylene glycol, a sugar
or sugar alcohol, lactic acid or boric acid according to established methods. Protected
enzymes may be prepared according to the method disclosed in EP 238,216.

The detergent composition of the invention may be in any convenient form, e.g., a bar, a
tablet, a powder, a granule, a paste or a liquid. A liquid detergent may be aqueous, typically
containing up to 70 % water and 0-30 % organic solvent, or non-aqueous.

The detergent composition comprises one or more surfactants, which may be non-ionic
including semi-polar and/or anionic and/or cationic and/or zwitterionic. The surfactants are
typically present at a level of from 0.1% to 60% by weight.

When included therein the detergent will usually contain from about 1% to about 40% of 
an anionic surfactant such as linear alkylbenzenesulfonate, alpha-olefinsulfonate, alkyl sulfate 
(fatty alcohol sulfate), alcohol ethoxysulfate, secondary alkanesulfonate, alpha-sulfo fatty acid 
methyl ester, alkyl- or alkenylsuccinic acid or soap. 

When included therein the detergent will usually contain from about 0.2% to about 40%
of a non-ionic surfactant such as alcohol ethoxylate, nonylphenol ethoxylate,
alkylpolyglycoside, alkyldimethylamineoxide, ethoxylated fatty acid monoethanolamide, fatty
acid monoethanolamide, polyhydroxy alkyl fatty acid amide, or N-acyl N-alkyl derivatives of
glucosamine ("glucamides").

The detergent may contain 0-65 % of a detergent builder or complexing agent such as
zeolite, diphosphate, triphosphate, phosphonate, carbonate, citrate, nitrilotriacetic acid,
ethylenediaminetetraacetic acid, diethylenetriaminepentaacetic acid, alkyl- or alkenylsuccinic
acid, soluble silicates or layered silicates (e.g. SKS-6 from Hoechst).

The detergent may comprise one or more polymers. Examples are
carboxymethylcellulose, poly(vinylpyrrolidone), poly(ethylene glycol), poly(vinyl alcohol),
poly(vinylpyridine-N-oxide), poly(vinylimidazole), polycarboxylates such as polyacrylates,
maleic/acrylic acid copolymers and lauryl methacrylate/acrylic acid copolymers.

The detergent may contain a bleaching system which may comprise a H2O2 source such
as perborate or percarbonate which may be combined with a peracid-forming bleach activator
such as tetraacetylethylenediamine or nonanoyloxybenzenesulfonate. Alternatively, the
bleaching system may comprise peroxyacids of e.g. the amide, imide, or sulfone type.

The enzyme(s) of the detergent composition of the invention may be stabilized using
conventional stabilizing agents, e.g., a polyol such as propylene glycol or glycerol, a sugar or
sugar alcohol, lactic acid, boric acid, or a boric acid derivative, e.g., an aromatic borate ester,
or a phenyl boronic acid derivative such as 4-formylphenyl boronic acid, and the composition
may be formulated as described in e.g. WO 92/19709 and WO 92/19708.

The detergent may also contain other conventional detergent ingredients such as e.g.
fabric conditioners including clays, foam boosters, suds suppressors, anti-corrosion agents,
soil-suspending agents, anti-soil redeposition agents, dyes, bactericides, optical brighteners,
hydrotropes, tarnish inhibitors, or perfumes.

It is at present contemplated that in the detergent compositions any enzyme, in particular
the polypeptide of the invention, may be added in an amount corresponding to 0.01-100 mg of
enzyme protein per liter of wash liquor, preferably 0.05-5 mg of enzyme protein per liter of
wash liquor, in particular 0.1-1 mg of enzyme protein per liter of wash liquor.
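
The dosage ranges above lend themselves to a simple calculation. The following is a minimal sketch, not part of the disclosed method: the function names are hypothetical, and the 15 liter wash volume in the usage lines is an assumed figure chosen only for illustration.

```python
from __future__ import annotations

# Dosage ranges quoted in the text, in mg enzyme protein per liter of wash liquor.
DOSAGE_RANGES_MG_PER_L = {
    "broad": (0.01, 100.0),
    "preferred": (0.05, 5.0),
    "particular": (0.1, 1.0),
}


def enzyme_mass_mg(dose_mg_per_l: float, wash_volume_l: float) -> float:
    """Total enzyme protein (mg) needed for a given volume of wash liquor."""
    if dose_mg_per_l < 0 or wash_volume_l < 0:
        raise ValueError("dose and volume must be non-negative")
    return dose_mg_per_l * wash_volume_l


def matching_ranges(dose_mg_per_l: float) -> list[str]:
    """Names of the quoted ranges that contain the given dose."""
    return [
        name
        for name, (low, high) in DOSAGE_RANGES_MG_PER_L.items()
        if low <= dose_mg_per_l <= high
    ]


if __name__ == "__main__":
    # Assumed example: 0.5 mg/L dosed into a 15 L wash (the volume is hypothetical).
    print(enzyme_mass_mg(0.5, 15.0))
    print(matching_ranges(0.5))
```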

The polypeptide of the invention may additionally be incorporated in the detergent 
formulations disclosed in WO 97/07202 which is hereby incorporated as reference. 

DNA recombination (shuffling) 

The nucleotide sequences of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ
ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:62, SEQ
ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:67 may be used in a DNA
recombination (or shuffling) process. The new polynucleotide sequences obtained in such a
process may encode new polypeptides having cellobiase activity with improved properties,
such as improved stability (storage stability, thermostability), improved specific activity,
improved pH-optimum, and/or improved tolerance towards specific compounds.

Shuffling between two or more homologous input polynucleotides (starting-point
polynucleotides) involves fragmenting the polynucleotides and recombining the fragments, to
obtain output polynucleotides (i.e. polynucleotides that have been subjected to a shuffling
cycle) wherein a number of nucleotide fragments are exchanged in comparison to the input
polynucleotides.
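
The fragment-and-recombine idea can be illustrated in silico. The sketch below is a deliberately simplified toy model and not any of the laboratory formats cited in this section: it assumes the parents are already aligned and of equal length, and it cuts at fixed block boundaries, whereas real shuffling relies on random fragmentation and homology-driven reassembly. All names are hypothetical.

```python
import random


def shuffle_parents(parents, fragment_len, rng):
    """Build one chimeric sequence by taking each fixed-length block
    from a randomly chosen parent (toy model: aligned, equal-length parents)."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("toy model assumes aligned parents of equal length")
    blocks = []
    for start in range(0, length, fragment_len):
        donor = rng.choice(parents)  # pick which parent donates this block
        blocks.append(donor[start:start + fragment_len])
    return "".join(blocks)


def make_library(parents, fragment_len, size, seed=0):
    """Generate a small library of chimeras; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [shuffle_parents(parents, fragment_len, rng) for _ in range(size)]


if __name__ == "__main__":
    # Two made-up, homologous 24-nt "parents" used purely for illustration.
    parent_a = "ATGGCTAAGCTTGGATCCGAATTC"
    parent_b = "ATGGCAAAGCTAGGTTCCGAGTTC"
    for chimera in make_library([parent_a, parent_b], fragment_len=6, size=4):
        print(chimera)
```

Every position in each output sequence carries the base of one of the parents at that position, which corresponds to the exchange of fragments between input polynucleotides described above.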

DNA recombination or shuffling may be a (partially) random process in which a library of 
chimeric genes is generated from two or more starting genes. A number of known formats can 
be used to carry out this shuffling or recombination process. 

The process may involve random fragmentation of parental DNA followed by reassembly
by PCR to new full-length genes, e.g. as presented in US5605793, US5811238, US5830721,
US6117679. In-vitro recombination of genes may be carried out, e.g. as described in
US6159687, WO98/41623, US6159688, US5965408, US6153510. The recombination process
may take place in vivo in a living cell, e.g. as described in WO 97/07205 and WO 98/28416.

The parental DNA may be fragmented by DNase I treatment or by restriction
endonuclease digests as described by Kikuchi et al (2000a, Gene 236:159-167). Shuffling of
two parents may be done by shuffling single stranded parental DNA of the two parents as
described in Kikuchi et al (2000b, Gene 243:133-137).

A particular method of shuffling is to follow the methods described in Crameri et al,
1998, Nature, 391: 288-291 and Ness et al. Nature Biotechnology 17: 893-896. Another format
would be the methods described in US 6159687: Examples 1 and 2.



Production of Ethanol from Biomass 

The present invention also relates to methods for producing ethanol from biomass, such
as cellulosic materials, comprising contacting the biomass with the polypeptides of the
invention. Ethanol may subsequently be recovered. The polypeptides of the invention may be
produced "in-situ", i.e., as part of, or directly in an ethanol production process, by cultivating a
host cell or a strain, which in its wild-type form is capable of producing the polypeptides, under
conditions conducive for production of the polypeptides.

Ethanol can be produced by enzymatic degradation of biomass and conversion of the
released polysaccharides to ethanol. This kind of ethanol is often referred to as bioethanol or
biofuel. It can be used as a fuel additive or extender in blends of from less than 1% and up to
100% (a fuel substitute). In some countries, such as Brazil, ethanol is substituting gasoline to
a very large extent.

The predominant polysaccharide in the primary cell wall of biomass is cellulose, the
second most abundant is hemi-cellulose, and the third is pectin. The secondary cell wall,
produced after the cell has stopped growing, also contains polysaccharides and is
strengthened through polymeric lignin covalently cross-linked to hemicellulose. Cellulose is a
homopolymer of anhydrocellobiose and thus a linear beta-(1-4)-D-glucan, while hemicelluloses
include a variety of compounds, such as xylans, xyloglucans, arabinoxylans, and mannans in
complex branched structures with a spectrum of substituents. Although generally
polymorphous, cellulose is found in plant tissue primarily as an insoluble crystalline matrix of
parallel glucan chains. Hemicelluloses usually hydrogen bond to cellulose, as well as to other
hemicelluloses, which helps stabilize the cell wall matrix.

Three major classes of cellulase enzymes are used to break down biomass:

• The "endo-1,4-beta-glucanases" or 1,4-beta-D-glucan-4-glucanohydrolases (EC 3.2.1.4),
which act randomly on soluble and insoluble 1,4-beta-glucan substrates.

• The "exo-1,4-beta-D-glucanases", including both the 1,4-beta-D-glucan glucohydrolases
(EC 3.2.1.74), which liberate D-glucose from 1,4-beta-D-glucans and hydrolyze D-cellobiose
slowly, and 1,4-beta-D-glucan cellobiohydrolase (EC 3.2.1.91), also referred to
as cellobiohydrolase I, which liberates D-cellobiose from 1,4-beta-glucans.

• The "beta-D-glucosidases" or beta-D-glucoside glucohydrolases (EC 3.2.1.21), which act
to release D-glucose units from cellobiose and soluble cellodextrins, as well as an array of
glycosides.
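
For bookkeeping purposes, the enzyme classes and EC numbers listed above can be held in a small lookup table. The following sketch only restates the list; the class and function names are hypothetical and carry no meaning beyond this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CellulaseClass:
    name: str
    ec_number: str
    action: str  # wording follows the list in the text


CELLULASE_CLASSES = [
    CellulaseClass(
        "endo-1,4-beta-glucanase", "3.2.1.4",
        "acts randomly on soluble and insoluble 1,4-beta-glucan substrates"),
    CellulaseClass(
        "1,4-beta-D-glucan glucohydrolase", "3.2.1.74",
        "liberates D-glucose from 1,4-beta-D-glucans"),
    CellulaseClass(
        "1,4-beta-D-glucan cellobiohydrolase (cellobiohydrolase I)", "3.2.1.91",
        "liberates D-cellobiose from 1,4-beta-glucans"),
    CellulaseClass(
        "beta-D-glucosidase", "3.2.1.21",
        "releases D-glucose from cellobiose and soluble cellodextrins"),
]


def by_ec(ec_number: str) -> Optional[CellulaseClass]:
    """Look up an enzyme class by EC number; returns None if it is not listed."""
    for entry in CELLULASE_CLASSES:
        if entry.ec_number == ec_number:
            return entry
    return None


if __name__ == "__main__":
    print(by_ec("3.2.1.91"))
```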

These three classes of enzymes work together synergistically in a complex interplay that
results in efficient decrystallization and hydrolysis of native cellulose from biomass to yield the
reducing sugars which are converted to ethanol by fermentation.

The present invention is further described by the following examples which should not be 
construed as limiting the scope of the invention. 

EXAMPLES

Chemicals used as buffers and substrates were commercial products of at least reagent 
grade. 

EXAMPLE 1 

Cloning of a partial and a full-length cellobiohydrolase I (CBH1) DNA sequence

A cDNA library of Diplodia gossypina was PCR screened for the presence of the CBH1 gene. For
this purpose, sets of primers were constructed, based on sequence alignment and
identification of conserved regions among CBH1 proteins. The PCR band from gel
electrophoresis was used to obtain a partial sequence of the CBH1 gene from Diplodia
gossypina. A homology search confirmed that the partial sequence was a partial sequence of
the CBH1 gene (EC 3.2.1.91).

The full-length CBH1 gene of Diplodia gossypina is obtained by accessing the patent deposit
CBS 247.96, making a DNA or cDNA preparation, using the partial sequence as the basis for
constructing specific primers, and using standard PCR cloning techniques to obtain the
entire gene step by step.

Several other approaches can be taken: 

• PCR screening of the cDNA library, or of the cDNAs that were used for the construction of
the library, could be performed. To do so, Gene Specific Primers (GSP) and
vector/adaptor primers are constructed from the partial cDNA sequence of the CBH1 gene
and from vector/adaptor sequence respectively; both sets of primers are designed to go
outward into the missing 5' and 3' regions of the CBH1 cDNA. The longest PCR products
obtained using combinations of GSP and vector/adaptor primer represent the full-length 5'
and 3' end regions of the CBH1 cDNA from Diplodia gossypina. Homology search and
comparison with the partial cDNA sequence confirm that the 5' and 3' PCR products
belong to the same CBH1 cDNA from Diplodia gossypina. The full-length cDNA can then
be obtained by PCR using a set of primers constructed from both the 5' and 3' ends.

• Alternatively, the cDNA library could be screened for the full-length cDNA using standard
hybridization techniques and the partial cDNA sequence as a probe. The clones giving a
positive hybridization signal with the probe are then purified and sequenced to determine
the longest cDNA sequence. Homology search and comparison confirm that the full-length
cDNA corresponds to the partial CBH1 cDNA sequence that was originally used as a
probe.
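
The orientation of gene specific primers that "go outward" from a known partial sequence can be sketched as follows. This is an illustrative sketch only: it picks primers by position and length, ignores melting temperature, GC content and secondary structure, and uses a made-up input sequence. The function names and the default primer length are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def outward_gene_specific_primers(partial_cdna: str, primer_len: int = 20):
    """Return (toward_5prime, toward_3prime) primers, both written 5' to 3'.

    toward_5prime is antisense: the reverse complement of the first
    primer_len bases of the known fragment, so that extension runs into
    the missing 5' region (paired with a vector/adaptor primer).
    toward_3prime is sense: the last primer_len bases of the known
    fragment, so that extension runs into the missing 3' region.
    """
    if len(partial_cdna) < 2 * primer_len:
        raise ValueError("partial sequence too short for two non-overlapping primers")
    toward_5prime = reverse_complement(partial_cdna[:primer_len])
    toward_3prime = partial_cdna[-primer_len:]
    return toward_5prime, toward_3prime


if __name__ == "__main__":
    # Made-up 50-nt fragment standing in for a partial cDNA sequence.
    fragment = "ATGGCCAAGTTTCCGGAATTCCGATCGGCTAAGCTTGGCACTGGCCGTCG"
    gsp_5, gsp_3 = outward_gene_specific_primers(fragment, primer_len=10)
    print(gsp_5, gsp_3)
```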

The two approaches described above rely on the presence of the full-length CBH1 cDNA in
the cDNA library or in the cDNAs used for its construction. Alternatively, the 5' and 3' RACE
(Rapid Amplification of cDNA Ends) techniques or derived techniques could be used to identify
the missing 5' and 3' regions. For this purpose, preferably mRNAs from Diplodia gossypina are
isolated and utilized to synthesize first strand cDNAs using an oligo(dT)-containing Adapter
Primer or a 5'-Gene Specific Primer (GSP).

The full-length cDNA of the CBH1 gene from Diplodia gossypina can also be obtained by
using genomic DNA from Diplodia gossypina. The CBH1 gene can be identified by PCR
techniques such as the one described above or by standard genomic library screening using
hybridization techniques and the partial CBH1 cDNA as a probe. Homology search and
comparison with the partial CBH1 cDNA confirm that the genomic sequence corresponds to
the CBH1 gene from Diplodia gossypina. Identification of consensus sequences such as the
initiation site of transcription, start and stop codons or polyA sites could be used to define the
region comprising the full-length cDNA. Primers constructed from both the 5' and 3' ends of
this region could then be used to amplify the full-length cDNA from mRNA or a cDNA library
from Diplodia gossypina (see above).
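
As a hedged illustration of using start and stop codons to delimit a coding region, the sketch below scans the three forward reading frames of a sequence for the longest stretch running from ATG to an in-frame stop codon. It is a simplification: it assumes an intron-free sequence (such as a cDNA), ignores the reverse strand, and does not look for transcription initiation or polyA signals. The function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str):
    """Return (start, end) of the longest ATG...stop open reading frame on
    the forward strand, with end exclusive and including the stop codon.
    Returns None if no complete ORF is found."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this frame opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in the same frame
    return best


if __name__ == "__main__":
    # Made-up sequence with a short ORF starting at index 2.
    example = "CCATGAAATTTGGGTAACC"
    span = longest_orf(example)
    print(span, example[span[0]:span[1]])
```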

By expression of the full-length gene in a suitable expression host construct, the CBH1 enzyme
is harvested as an intracellular or extracellular enzyme from the culture broth.

The methods described above apply to the cloning of cellobiohydrolase I DNA sequences from 
all organisms and not only Diplodia gossypina. 

EXAMPLE 2

Cellobiohydrolase I (CBH I) Activity 

A cellobiohydrolase I is characterized by the ability to hydrolyze highly crystalline
cellulose very efficiently compared to other cellulases. Cellobiohydrolase I may have a higher
catalytic activity using PASC (phosphoric acid swollen cellulose) as substrate than using CMC
as substrate. For the purposes of the present invention, any of the following assays can be
used to identify a cellobiohydrolase I:

Activity on Azo-Avicel 

Azo-Avicel (Megazyme, Bray Business Park, Bray, Wicklow, Ireland) was used according to
the manufacturer's instructions.

Activity on PNP-beta-cellobiose 

Substrate solution: 5 mM PNP beta-D-Cellobiose (p-Nitrophenyl beta-D-Cellobioside, Sigma N-5759)
in 0.1 M Na-acetate buffer, pH 5.0;

Stop reagent: 0.1 M Na-carbonate, pH 11.5.

50 µL CBH I solution was mixed with 1 mL substrate solution and incubated 20 minutes
at 40°C. The reaction was stopped by addition of 5 mL stop reagent. Absorbance was
measured at 404 nm.
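
The volumes in this assay fix the final dilution of the enzyme sample, and the absorbance reading can be converted to an amount of released p-nitrophenol with the Beer-Lambert law. The sketch below illustrates both steps under stated assumptions: the text gives no extinction coefficient or path length, so both are left as caller-supplied parameters, and the function names are hypothetical.

```python
def final_volume_ul(sample_ul: float = 50.0,
                    substrate_ul: float = 1000.0,
                    stop_ul: float = 5000.0) -> float:
    """Total volume after adding stop reagent, in microliters."""
    return sample_ul + substrate_ul + stop_ul


def sample_dilution_factor(sample_ul: float = 50.0,
                           substrate_ul: float = 1000.0,
                           stop_ul: float = 5000.0) -> float:
    """Fold dilution of the enzyme sample in the final mixture."""
    return final_volume_ul(sample_ul, substrate_ul, stop_ul) / sample_ul


def pnp_released_umol(a404: float,
                      epsilon_per_mm_cm: float,
                      path_cm: float = 1.0,
                      total_volume_ul: float = 6050.0) -> float:
    """Micromoles of p-nitrophenol in the final mixture (Beer-Lambert law).

    epsilon_per_mm_cm is NOT given in the text; the caller must supply a
    value determined for the actual buffer, pH and instrument. The 1 cm
    default path length is likewise an assumption.
    """
    concentration_mm = a404 / (epsilon_per_mm_cm * path_cm)  # mM = umol/mL
    return concentration_mm * total_volume_ul / 1000.0       # uL -> mL


if __name__ == "__main__":
    print(final_volume_ul())          # 6050.0 uL
    print(sample_dilution_factor())   # 121.0
```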

Activity on PASC and CMC 

The substrate is degraded with cellobiohydrolase I (CBH I) to form reducing sugars. A
Microdochium nivale carbohydrate oxidase (rMnO) or another equivalent oxidase acts on the
reducing sugars to form H2O2 in the presence of O2. The formed H2O2 activates, in the
presence of excess peroxidase, the oxidative condensation of 4-aminoantipyrine (AA) and
N-ethyl-N-sulfopropyl-m-toluidine (TOPS) to form a purple product which can be quantified by its
absorbance at 550 nm.

When all components except CBH I are in surplus, the rate of increase in absorbance is 
proportional to the CBH I activity. The reaction is a one-kinetic-step reaction and may be 
carried out automatically in a Cobas Fara centrifugal analyzer (Hoffmann La Roche) or 
another equivalent spectrophotometer which can measure steady state kinetics. 

Buffer: 50 mM Na-acetate buffer (pH 5.0);

Reagents: rMnO oxidase, purified Microdochium nivale carbohydrate oxidase, 2 mg/L (final concentration);
Peroxidase, SIGMA P-8125 (96 U/mg), 25 mg/L (final concentration);
4-aminoantipyrine, SIGMA A-4382, 200 mg/L (final concentration);
TOPS, SIGMA E-8506, 600 mg/L (final concentration);
PASC or CMC (see below), 5 g/L (final concentration).

All reagents were added to the buffer in the concentrations indicated above and this
reagent solution was mixed thoroughly.

50 µL cellobiohydrolase I sample (in a suitable dilution) was mixed with 300 µL reagent
solution and incubated 20 minutes at 40°C. Purple color formation was detected and
measured as absorbance at 550 nm.

The AA/TOPS-condensate absorption coefficient is 0.01935 A550/(mM cm). The rate is
calculated as µmoles reducing sugar produced per minute from OD550/minute and the
absorption coefficient.
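
The rate calculation can be sketched as a linear fit of absorbance against time followed by division by the absorption coefficient. This is an illustration under assumptions, not the analyzer's own procedure: a 1 cm path length is assumed, one molecule of condensate is assumed to form per reducing sugar, the result is a concentration rate in whatever concentration unit the coefficient is expressed in, and the function names are hypothetical.

```python
def slope_per_minute(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units/min)."""
    n = len(times_min)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    return sxy / sxx


def reducing_sugar_rate(delta_a550_per_min: float,
                        absorption_coefficient: float,
                        path_length_cm: float = 1.0) -> float:
    """Concentration change per minute, in the concentration unit of the
    absorption coefficient (path length default is an assumption)."""
    return delta_a550_per_min / (absorption_coefficient * path_length_cm)


if __name__ == "__main__":
    # Made-up readings: absorbance rises by 0.1 per minute.
    times = [0.0, 1.0, 2.0, 3.0]
    readings = [0.10, 0.20, 0.30, 0.40]
    slope = slope_per_minute(times, readings)
    print(slope, reducing_sugar_rate(slope, 0.01935))
```

Multiplying the concentration rate by the reaction volume gives an amount of reducing sugar formed per minute.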



PASC: 

Materials: 5 g Avicel® (Art. 2331 Merck);
150 mL 85% Ortho-phosphoric-acid (Art. 573 Merck);
800 mL Acetone (Art. 14 Merck);
Approx. 2 liter deionized water (Milli-Q);
1 L glass beaker;
1 L glass filter funnel;
2 L suction flask;
Ultra Turrax Homogenizer.

Acetone and ortho-phosphoric-acid are cooled on ice. Avicel® is moistened with water, and
then the 150 mL ice-cold 85% ortho-phosphoric-acid is added. The mixture is placed on an
ice bath with weak stirring for one hour.

Add 500 mL ice-cold acetone with stirring, transfer the mixture to a glass filter funnel,
and wash with 3 x 100 mL ice-cold acetone, sucking as dry as possible in each wash. Wash with
2 x 500 mL water (or until there is no odor of acetone), sucking as dry as possible in each wash.

Re-suspend the solids in water to a total volume of 500 mL, and blend to homogeneity
using an Ultra Turrax Homogenizer. Store wet in a refrigerator and equilibrate with buffer by
centrifugation and re-suspension before use.

CMC: 

Bacterial cellulose microfibrils in an impure form were obtained from the Japanese
foodstuff "nata de coco" (Fujico Company, Japan). The cellulose in 350 g of this product was
purified by suspension of the product in about 4 L of tap water. This water was replaced by
fresh water twice a day for 4 days.

Then 1% (w/v) NaOH was used instead of water and the product was re-suspended in
the alkali solution twice a day for 4 days. Neutralisation was done by rinsing the purified
cellulose with distilled water until the pH at the surface of the product was neutral (pH 7).

The cellulose was microfibrillated and a suspension of individual bacterial cellulose
microfibrils was obtained by homogenisation of the purified cellulose microfibrils in a Waring
blender for 30 min. The cellulose microfibrils were further purified by dialysing this suspension
through a pore membrane against distilled water and the isolated and purified cellulose
microfibrils were stored in a water suspension at 4°C.

Deposit of Biological Material 

China General Microbiological Culture Collection Center (CGMCC) 

The following biological material has been deposited under the terms of the Budapest 
Treaty with the China General Microbiological Culture Collection Center (CGMCC), Institute of 
Microbiology, Chinese Academy of Sciences, Haidian, Beijing 100080, China: 

Accession Number: CGMCC No. 0584
Applicants reference: ND000575
Date of Deposit: 2001-05-29
Description: Acremonium thermophilum CBH I gene on plasmid
Classification: Ascomycota; Sordariomycetes; Hypocreales; Hypocreaceae
Origin: China, 1999
Related sequence(s): SEQ ID NO:1 and SEQ ID NO:2 (DNA sequence encoding a
cellobiohydrolase I from Acremonium thermophilum and the
corresponding protein sequence)



Accession Number: CGMCC No. 0581
Applicants reference: ND000548
Date of Deposit: 2001-05-29
Description: Chaetomium thermophilum CBH I gene on plasmid
Classification: Ascomycota; Sordariomycetes; Sordariales; Chaetomiaceae
Origin: China, 1999
Related sequence(s): SEQ ID NO:3 and SEQ ID NO:4 (DNA sequence encoding a
cellobiohydrolase I from Chaetomium thermophilum and the
corresponding protein sequence)



Accession Number: CGMCC No. 0585
Applicants reference: ND001223
Date of Deposit: 2001-05-29
Description: Scytalidium sp. CBH I gene on plasmid
Classification: Ascomycota; Mitosporic
Origin: China, 1999
Related sequence(s): SEQ ID NO:5 and SEQ ID NO:6 (DNA sequence encoding a
cellobiohydrolase I from Scytalidium sp. and the corresponding
protein sequence)



Accession Number: CGMCC No. 0582
Applicants reference: ND000549
Date of Deposit: 2001-05-29
Description: Thermoascus aurantiacus CBH I gene on plasmid
Classification: Eurotiomycetes; Eurotiales; Trichocomaceae
Origin: China
Related sequence(s): SEQ ID NO:7 and SEQ ID NO:8 (DNA sequence encoding a
cellobiohydrolase I from Thermoascus aurantiacus and the
corresponding protein sequence)



Accession Number: CGMCC No. 0583
Applicants reference: ND001182
Date of Deposit: 2001-05-29
Description: Thielavia australiensis CBH I gene on plasmid
Classification: Ascomycota; Sordariomycetes; Sordariales; Chaetomiaceae
Origin: China, 1998
Related sequence(s): SEQ ID NO:9 and SEQ ID NO:10 (DNA sequence encoding a
cellobiohydrolase I from Thielavia australiensis and the
corresponding protein sequence)

WO 03/000941 PCT/DK02/00429



Accession Number: CGMCC No. 0580
Applicants reference: ND000562
Date of Deposit: 2001-05-29
Description: Melanocarpus albomyces CBH I gene on plasmid
Classification: Ascomycota; Sordariomycetes; Sordariales
Origin: China, 1999
Related sequence(s): SEQ ID NO:15 and SEQ ID NO:16 (DNA sequence encoding a
cellobiohydrolase I from Melanocarpus albomyces and the
corresponding protein sequence)



Accession Number: CGMCC No. 0748
Applicants reference: ND001181
Date of Deposit: 2002-06-07
Description: Acremonium sp. CBH I gene on plasmid
Classification: mitosporic Ascomycetes
Origin: China, 2000
Related sequence(s): SEQ ID NO:53 and SEQ ID NO:54



Accession Number: CGMCC No. 0749
Applicants reference: ND000577
Date of Deposit: 2002-06-07
Description: Chaetomidium pingtungium CBH I gene on plasmid
Classification: Chaetomiaceae, Sordariales, Ascomycota
Origin: China, 2000
Related sequence(s): SEQ ID NO:55 and SEQ ID NO:56



Accession Number: CGMCC No. 0747
Applicants reference: ND001175
Date of Deposit: 2002-06-07
Description: Sporotrichum pruinosum CBH I gene on plasmid
Classification: Meruliaceae, Stereales, Basidiomycota
Origin: China, 2000
Related sequence(s): SEQ ID NO:57 and SEQ ID NO:58



Accession Number: CGMCC No. 0750
Applicants reference: ND000571
Date of Deposit: 2002-06-07
Description: Scytalidium thermophilum CBH I gene on plasmid
Classification: Ascomycota; Mitosporic
Origin: China, 2000
Related sequence(s): SEQ ID NO:59 and SEQ ID NO:60



15 



Centraalbureau Voor Schimmelcultures (CBS) 

The following biological material has been deposited under the terms of the Budapest 
Treaty with the Centraalbureau Voor Schimmelcultures (CBS), Uppsalalaan 8, 3584 CT 
Utrecht, The Netherlands (alternatively P.O.Box 85167, 3508 AD Utrecht, The Netherlands): 



Accession Number: CBS 109513
Applicants reference: ND000538
Date of Deposit: 2001-06-01
Description: Verticillium tenerum
Classification: Ascomycota, Hypocreales, Pyrenomycetes (mitosporic)
Origin:
Related sequence(s): SEQ ID NO:11 and SEQ ID NO:12 (DNA sequence encoding a cellobiohydrolase I from Verticillium tenerum and the corresponding protein sequence)



Accession Number: CBS 819-73
Applicants reference: ND000533
Date of Deposit: Publicly available (not deposited by applicant)
Description: Humicola nigrescens
Classification: Sordariaceae, Sordariales, Sordariomycetes; Ascomycota
Origin:
Related sequence(s): SEQ ID NO:18 (partial DNA sequence encoding a cellobiohydrolase I from Humicola nigrescens)



Accession Number: CBS 427.97
Applicants reference: ND000530
Date of Deposit: 1997-01-23
Description: Cladorrhinum foecundissimum
Classification: Sordariaceae, Sordariales, Sordariomycetes; Ascomycota
Origin: Jamaica
Related sequence(s): SEQ ID NO:19 (partial DNA sequence encoding a cellobiohydrolase I from Cladorrhinum foecundissimum)



Accession Number: CBS 247.96
Applicants reference: ND000534 and ND001231
Date of Deposit: 1996-03-12
Description: Diplodia gossypina
Classification: Dothideaceae, Dothideales, Dothideomycetes; Ascomycota
Origin: Indonesia, 1992
Related sequence(s): SEQ ID NO:20 (partial DNA sequence encoding a cellobiohydrolase I from Diplodia gossypina), SEQ ID NO:37 (full DNA sequence encoding a cellobiohydrolase I from Diplodia gossypina) and SEQ ID NO:38 (full cellobiohydrolase I protein sequence from Diplodia gossypina)

Accession Number: CBS 117.65
Applicants reference: ND000536
Date of Deposit: Publicly available
Description: Myceliophthora thermophila
Classification: Sordariaceae, Sordariales, Sordariomycetes; Ascomycota
Origin:
Related sequence(s): SEQ ID NO:21 (partial DNA sequence encoding a cellobiohydrolase I from Myceliophthora thermophila)



Accession Number: CBS 109471
Applicants reference: ND000537
Date of Deposit: 2001-05-29
Description: Rhizomucor pusillus
Classification: Mucoraceae, Mucorales, Zygomycota
Origin: Denmark
Related sequence(s): SEQ ID NO:22 (partial DNA sequence encoding a cellobiohydrolase I from Rhizomucor pusillus)



Accession Number: CBS 521.95
Applicants reference: ND000542
Date of Deposit: 1995-07-04
Description: Meripilus giganteus
Classification: Rigidiporaceae, Hymenomycetales, Basidiomycota
Origin: Denmark, 1993
Related sequence(s): SEQ ID NO:23 (partial DNA sequence encoding a cellobiohydrolase I from Meripilus giganteus)



Accession Number: CBS 277.96
Applicants reference: ND000543, ND001346 and ND001243
Date of Deposit: 1996-03-12
Description: Exidia glandulosa
Classification: Exidiaceae, Auriculariales, Hymenomycetes, Basidiomycota
Origin: Denmark, 1993
Related sequence(s): SEQ ID NO:24 (partial DNA sequence encoding a cellobiohydrolase I from Exidia glandulosa), SEQ ID NO:45 (full DNA sequence encoding a cellobiohydrolase I with CBD from Exidia glandulosa), SEQ ID NO:46 (full cellobiohydrolase I protein sequence with CBD from Exidia glandulosa), SEQ ID NO:47 (full DNA sequence encoding a cellobiohydrolase I from Exidia glandulosa) and SEQ ID NO:48 (full cellobiohydrolase I protein sequence from Exidia glandulosa)



Accession Number: CBS 284.96
Applicants reference: ND000544 and ND001235
Date of Deposit: 1996-03-12
Description: Xylaria hypoxylon
Classification: Sordariaceae, Sordariales, Sordariomycetes, Ascomycota
Origin: Denmark, 1993
Related sequence(s): SEQ ID NO:25 (partial DNA sequence encoding a cellobiohydrolase I from Xylaria hypoxylon), SEQ ID NO:43 (full DNA sequence encoding a cellobiohydrolase I from Xylaria hypoxylon) and SEQ ID NO:44 (full cellobiohydrolase I protein sequence from Xylaria hypoxylon)

Accession Number: CBS 804.70
Applicants reference: ND001227
Date of Deposit: Publicly available
Description: Trichophaea saccata
Classification: Ascomycota; Pezizomycetes; Pezizales; Pyronemataceae
Related sequence(s): SEQ ID NO:36 (partial DNA sequence encoding a cellobiohydrolase I from Trichophaea saccata)



Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ)

The following biological material has been deposited under the terms of the Budapest 
Treaty with the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ), 
Mascheroder Weg 1b, 38124 Braunschweig, Germany: 



Accession Number: DSM 14348
Applicants reference: ND000551
Date of Deposit: 2001-06-13
Description: Neotermes castaneus, termite CBH I gene on plasmid
Classification:
Origin: Cultures of termite larvae bought from BAM, Germany, 1999
Related sequence(s): SEQ ID NO:13 and SEQ ID NO:14 (DNA sequence encoding a cellobiohydrolase I from gut cells or microbes from the gut of Neotermes castaneus and the corresponding protein sequence)



Accession Number: DSM 15066
Applicants reference: ND001349
Date of Deposit: 2002-06-21
Description: Poitrasia circinans CBH I gene on plasmid
Classification: Choanephoraceae, Zygomycota, Mucorales
Origin:
Related sequence(s): SEQ ID NO:49 (DNA sequence encoding a cellobiohydrolase I from Poitrasia circinans) and SEQ ID NO:50 (cellobiohydrolase I protein sequence from Poitrasia circinans)



Accession Number: DSM 15065
Applicants reference: ND001339
Date of Deposit: 2002-06-21
Description: Coprinus cinereus CBH I gene on plasmid
Classification: Basidiomycota, Hymenomycetes; Agaricales, Agaricaceae
Origin: Denmark
Related sequence(s): SEQ ID NO:51 (DNA sequence encoding a cellobiohydrolase I from Coprinus cinereus) and SEQ ID NO:52 (cellobiohydrolase I protein sequence from Coprinus cinereus)



Accession Number: DSM 15064
Applicants reference: ND001264
Date of Deposit: 2002-06-21
Description: Trichophaea saccata CBH I gene on plasmid
Classification: Ascomycota; Pezizomycetes; Pezizales; Pyronemataceae
Origin:
Related sequence(s): SEQ ID NO:39 (DNA sequence encoding a cellobiohydrolase I from Trichophaea saccata) and SEQ ID NO:40 (cellobiohydrolase I protein sequence from Trichophaea saccata)



Accession Number: DSM 15067
Applicants reference: ND001232
Date of Deposit: 2002-06-21
Description: Myceliophthora thermophila CBH I gene on plasmid
Classification: Sordariaceae, Sordariales, Sordariomycetes; Ascomycota
Origin:
Related sequence(s): SEQ ID NO:41 (DNA sequence encoding a cellobiohydrolase I from Myceliophthora thermophila) and SEQ ID NO:42 (cellobiohydrolase I protein sequence from Myceliophthora thermophila)



Institute for Fermentation, Osaka (IFO)

The following biological material has been deposited under the terms of the Budapest 
Treaty with the Institute for Fermentation, Osaka (IFO), 17-85, Juso-honmachi 2-chome, 
Yodogawa-ku, Osaka 532-8686, Japan: 

Accession Number: IFO 5372 

Applicants reference: ND000531 

Date of Deposit: Publicly available (not deposited by applicant) 

Description: Trichothecium roseum 



Classification: mitosporic Ascomycetes

Origin:

Related sequence(s): SEQ ID NO:17 (partial DNA sequence encoding a cellobiohydrolase I from Trichothecium roseum)



The deposits CBS 427.97, CBS 247.96, CBS 521.95, CBS 284.96 and CBS 274.96 were made
by Novo Nordisk A/S and were later assigned to Novozymes A/S.
0-1 Form - PCT/RO/134 (EASY) Indications Relating to Deposited Microorganism(s) or Other Biological Material (PCT Rule 13bis)
0-1-1 Prepared using: PCT-EASY Version 2.92 (updated 01.06.2002)
0-2 International Application No.:
0-3 Applicant's or agent's file reference: 10129.204-WO




1 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
1-1 page: 63-64
1-2 line: 31-2
1-3 Identification of Deposit
1-3-1 Name of depositary institution: China General Microbiological Culture Collection Center
1-3-2 Address of depositary institution: China Committee for Culture Collection of Microorganisms, P.O. Box 2714, Beijing 100080, China
1-3-3 Date of deposit: 29 May 2001 (29.05.2001)
1-3-4 Accession Number: CGMCC 0584
1-4 Additional Indications: NONE
1-5 Designated States for Which Indications are Made: all designated States
1-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE


2 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
2-1 page: 64
2-2 line: 4-12
2-3 Identification of Deposit
2-3-1 Name of depositary institution: China General Microbiological Culture Collection Center
2-3-2 Address of depositary institution: China Committee for Culture Collection of Microorganisms, P.O. Box 2714, Beijing 100080, China
2-3-3 Date of deposit: 29 May 2001 (29.05.2001)
2-3-4 Accession Number: CGMCC 0581
2-4 Additional Indications: NONE
2-5 Designated States for Which Indications are Made: all designated States
2-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

3 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
3-1 page: 64
3-2 line: 14-22
3-3 Identification of Deposit
3-3-1 Name of depositary institution: China General Microbiological Culture Collection Center
3-3-2 Address of depositary institution: China Committee for Culture Collection of Microorganisms, P.O. Box 2714, Beijing 100080, China
3-3-3 Date of deposit: 29 May 2001 (29.05.2001)
3-3-4 Accession Number: CGMCC 0585
3-4 Additional Indications: NONE
3-5 Designated States for Which Indications are Made: all designated States
3-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE


4 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
4-1 page: 64
4-2 line: 24-32
4-3 Identification of Deposit
4-3-1 Name of depositary institution: China General Microbiological Culture Collection Center
4-3-2 Address of depositary institution: China Committee for Culture Collection of Microorganisms, P.O. Box 2714, Beijing 100080, China
4-3-3 Date of deposit: 29 May 2001 (29.05.2001)
4-3-4 Accession Number: CGMCC 0582
4-4 Additional Indications: NONE
4-5 Designated States for Which Indications are Made: all designated States
4-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE


5 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
5-1 page: 64-65
5-2 line: 34-5
5-3 Identification of Deposit
5-3-1 Name of depositary institution: China General Microbiological Culture Collection Center
5-3-2 Address of depositary institution: China Committee for Culture Collection of Microorganisms, P.O. Box 2714, Beijing 100080, China
5-3-3 Date of deposit: 29 May 2001 (29.05.2001)
5-3-4 Accession Number: CGMCC 0583
5-4 Additional Indications: NONE
5-5 Designated States for Which Indications are Made: all designated States
5-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE


6 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
6-1 page: 65
6-2 line: 7-15
6-3 Identification of Deposit
6-3-1 Name of depositary institution: China General Microbiological Culture Collection Center
6-3-2 Address of depositary institution: China Committee for Culture Collection of Microorganisms, P.O. Box 2714, Beijing 100080, China
6-3-3 Date of deposit: 29 May 2001 (29.05.2001)
6-3-4 Accession Number: CGMCC 0580
6-4 Additional Indications: NONE
6-5 Designated States for Which Indications are Made: all designated States
6-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE


7 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
7-1 page: 65
7-2 line: 17-23
7-3 Identification of Deposit
7-3-1 Name of depositary institution: Centre General Chinois de Cultures Microbiologiques
7-3-2 Address of depositary institution: Chine - Comite pour la collection de cultures de micro-organismes, P.O. Box 2714, Beijing 100080
7-3-3 Date of deposit: 07 June 2002 (07.06.2002)
7-3-4 Accession Number: CGCCM 0748
7-4 Additional Indications: NONE
7-5 Designated States for Which Indications are Made: all designated States
7-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

8 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
8-1 page: 65
8-2 line: 25-31
8-3 Identification of Deposit
8-3-1 Name of depositary institution: Centre General Chinois de Cultures Microbiologiques
8-3-2 Address of depositary institution: Chine - Comite pour la collection de cultures de micro-organismes, P.O. Box 2714, Beijing 100080
8-3-3 Date of deposit: 07 June 2002 (07.06.2002)
8-3-4 Accession Number: CGCCM 0749
8-4 Additional Indications: NONE
8-5 Designated States for Which Indications are Made: all designated States
8-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE



9 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
9-1 page: 65-66
9-2 line: 33-2
9-3 Identification of Deposit
9-3-1 Name of depositary institution: Centre General Chinois de Cultures Microbiologiques
9-3-2 Address of depositary institution: Chine - Comite pour la collection de cultures de micro-organismes, P.O. Box 2714, Beijing 100080
9-3-3 Date of deposit: 07 June 2002 (07.06.2002)
9-3-4 Accession Number: CGCCM 0747
9-4 Additional Indications: NONE
9-5 Designated States for Which Indications are Made: all designated States
9-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE



10 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
10-1 page: 66
10-2 line: 4-10
10-3 Identification of Deposit
10-3-1 Name of depositary institution: Centre General Chinois de Cultures Microbiologiques
10-3-2 Address of depositary institution: Chine - Comite pour la collection de cultures de micro-organismes, P.O. Box
10-3-3 Date of deposit: 07 June 2002 (07.06.2002)
10-3-4 Accession Number: CGCCM 0750
10-4 Additional Indications:
10-5 Designated States for Which Indications are Made: all designated States
10-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE




11 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
11-1 page: 66
11-2 line: 18-26
11-3 Identification of Deposit
11-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
11-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
11-3-3 Date of deposit: 01 June 2001 (01.06.2001)
11-3-4 Accession Number: CBS 109513
11-4 Additional Indications: NONE
11-5 Designated States for Which Indications are Made: all designated States
11-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE




12 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
12-1 page: 66-67
12-2 line: 37-7
12-3 Identification of Deposit
12-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
12-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
12-3-3 Date of deposit: 23 January 1997 (23.01.1997)
12-3-4 Accession Number: CBS 427.97
12-4 Additional Indications: NONE
12-5 Designated States for Which Indications are Made: all designated States
12-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

13 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
13-1 page:
13-2 line: 9-19
13-3 Identification of Deposit
13-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
13-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
13-3-3 Date of deposit: 12 March 1996 (12.03.1996)
13-3-4 Accession Number: CBS 247.96
13-4 Additional Indications: NONE
13-5 Designated States for Which Indications are Made: all designated States
13-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

14 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
14-1 page: 67
14-2 line: 30-37
14-3 Identification of Deposit
14-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
14-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
14-3-3 Date of deposit: 29 May 2001 (29.05.2001)
14-3-4 Accession Number: CBS 109471
14-4 Additional Indications: NONE
14-5 Designated States for Which Indications are Made: all designated States
14-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

15 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
15-1 page: 68
15-2 line: 2-9
15-3 Identification of Deposit
15-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
15-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
15-3-3 Date of deposit: 04 July 1995 (04.07.1995)
15-3-4 Accession Number: CBS 521.95
15-4 Additional Indications: NONE
15-5 Designated States for Which Indications are Made: all designated States
15-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

16 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
16-1 page: 68
16-2 line: 26-36
16-3 Identification of Deposit
16-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
16-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
16-3-3 Date of deposit: 12 March 1996 (12.03.1996)
16-3-4 Accession Number: CBS 284.96
16-4 Additional Indications: NONE
16-5 Designated States for Which Indications are Made: all designated States
16-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE


17 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
17-1 page: 68
17-2 line: 11-24
17-3 Identification of Deposit
17-3-1 Name of depositary institution: Centraalbureau voor Schimmelcultures
17-3-2 Address of depositary institution: Uppsalalaan 8, NL-3584 CT Utrecht, The Netherlands / P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands
17-3-3 Date of deposit: 12 March 1996 (12.03.1996)
17-3-4 Accession Number: CBS 277.96
17-4 Additional Indications: NONE
17-5 Designated States for Which Indications are Made: all designated States
17-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE



18 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
18-1 page: 69
18-2 line: 15-23
18-3 Identification of Deposit
18-3-1 Name of depositary institution: DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
18-3-2 Address of depositary institution: Mascheroder Weg 1b, D-38124 Braunschweig, Germany
18-3-3 Date of deposit: 13 June 2001 (13.06.2001)
18-3-4 Accession Number: DSMZ 14348
18-4 Additional Indications: NONE
18-5 Designated States for Which Indications are Made: all designated States
18-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE



19 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
19-1 page: 69
19-2 line: 25-33
19-3 Identification of Deposit
19-3-1 Name of depositary institution: DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
19-3-2 Address of depositary institution: Mascheroder Weg 1b, D-38124 Braunschweig, Germany
19-3-3 Date of deposit: 21 June 2002 (21.06.2002)
19-3-4 Accession Number: DSMZ 15066
19-4 Additional Indications: NONE
19-5 Designated States for Which Indications are Made: all designated States
19-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE



20 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
20-1 page: 69-70
20-2 line: 35-6
20-3 Identification of Deposit
20-3-1 Name of depositary institution: DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
20-3-2 Address of depositary institution: Mascheroder Weg 1b, D-38124 Braunschweig, Germany
20-3-3 Date of deposit: 21 June 2002 (21.06.2002)
20-3-4 Accession Number: DSMZ 15065
20-4 Additional Indications: NONE
20-5 Designated States for Which Indications are Made: all designated States
20-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE

21 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
21-1 page: 70
21-2 line: 8-16
21-3 Identification of Deposit
21-3-1 Name of depositary institution: DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
21-3-2 Address of depositary institution: Mascheroder Weg 1b, D-38124 Braunschweig, Germany
21-3-3 Date of deposit: 21 June 2002 (21.06.2002)
21-3-4 Accession Number: DSMZ 15064
21-4 Additional Indications: NONE
21-5 Designated States for Which Indications are Made: all designated States
21-6 Separate Furnishing of Indications (These indications will be submitted to the International Bureau later): NONE



22 The indications made below relate to the deposited microorganism(s) or other biological material referred to in the description on:
22-1 page: 70
22-2 line: 18-26
22-3 Identification of Deposit
22-3-1 Name of depositary institution: DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH
22-3-2 Address of depositary institution: Mascheroder Weg 1b, D-38124 Braunschweig, Germany
22-3-3 Date of deposit: 21 June 2002 (21.06.2002)
22-3-4 Accession Number: DSMZ 15067
22-4 Additional Indications: NONE
22-5 Designated States for Which Indications are Made: all designated States

1. A polypeptide having cellobiohydrolase I activity, selected from the group consisting of:

(a) a polypeptide comprising an amino acid sequence selected from the group consisting of:
an amino acid sequence which has at least 80% identity with amino acids 1 to 526 of SEQ ID NO:2,
an amino acid sequence which has at least 80% identity with amino acids 1 to 529 of SEQ ID NO:4,
an amino acid sequence which has at least 80% identity with amino acids 1 to 451 of SEQ ID NO:6,
an amino acid sequence which has at least 80% identity with amino acids 1 to 457 of SEQ ID NO:8,
an amino acid sequence which has at least 80% identity with amino acids 1 to 538 of SEQ ID NO:10,
an amino acid sequence which has at least 70% identity with amino acids 1 to 415 of SEQ ID NO:12,
an amino acid sequence which has at least 70% identity with amino acids 1 to 447 of SEQ ID NO:14,
an amino acid sequence which has at least 80% identity with amino acids 1 to 452 of SEQ ID NO:16,
an amino acid sequence which has at least 80% identity with amino acids 1 to 454 of SEQ ID NO:38,
an amino acid sequence which has at least 80% identity with amino acids 1 to 458 of SEQ ID NO:40,
an amino acid sequence which has at least 80% identity with amino acids 1 to 450 of SEQ ID NO:42,
an amino acid sequence which has at least 80% identity with amino acids 1 to 446 of SEQ ID NO:44,
an amino acid sequence which has at least 80% identity with amino acids 1 to 527 of SEQ ID NO:46,
an amino acid sequence which has at least 80% identity with amino acids 1 to 455 of SEQ ID NO:48,
an amino acid sequence which has at least 80% identity with amino acids 1 to 464 of SEQ ID NO:50,
an amino acid sequence which has at least 80% identity with amino acids 1 to 460 of SEQ ID NO:52,
an amino acid sequence which has at least 80% identity with amino acids 1 to 450 of SEQ ID NO:54,
an amino acid sequence which has at least 80% identity with amino acids 1 to 532 of SEQ ID NO:56,
an amino acid sequence which has at least 80% identity with amino acids 1 to 460 of SEQ ID NO:58,
an amino acid sequence which has at least 80% identity with amino acids 1 to 525 of SEQ ID NO:60, and
an amino acid sequence which has at least 80% identity with amino acids 1 to 456 of SEQ ID NO:66;



(b) a polypeptide comprising an amino acid sequence selected from the group consisting of: 
an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Acremonium 
15 thermophilum, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Chaetomium 
thermophilum, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
20 the cellobiohydrolase I encoding part of the nucleotide sequence present in Scytalidium 

sp., 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Scytalidium 
thermophilum, 

25 an amino acid sequence which has at least 80% identity with the polypeptide encoded by 

the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Thermoascus aurantiacus, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Thielavia 
30 australiensis, 

an amino acid sequence which has at least 70% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Verticillium 
tenerum, 

an amino acid sequence which has at least 70% identity with the polypeptide encoded by 
35 the cellobiohydrolase I encoding part of the nucleotide sequence present in Neotermes 

castaneus, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
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the cellobiohydrolase 1 encoding part of the nucleotide sequence present in 
Melanocarpus albomyces, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded 
by the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Acremonium sp., 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Chaetomidium pingtungium, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Sporotrichum pruinosum, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Diplodia 
gossypina, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Trichophaea 
saccata, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Myceliophthora thermophila, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Exidia 
glandulosa, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Xylaria 
hypoxylon,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Poitrasia 
circinans, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in Coprinus 
cinereus, 

an amino acid sequence which has at least 80% identity with the polypeptide encoded by 
the cellobiohydrolase I encoding part of the nucleotide sequence present in 
Pseudoplectania nigrella, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Trichothecium roseum IFO 5372,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Humicola nigrescens CBS 819.73,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Cladorrhinum foecundissimum CBS 427.97,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Diplodia gossypina CBS 247.96,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Myceliophthora thermophila CBS 117.65,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Rhizomucor pusillus CBS 109471,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the
nucleotide sequence present in Meripilus giganteus CBS 521.95,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Exidia glandulosa CBS 2377.96, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Xylaria hypoxylon CBS 284.96, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Trichophaea saccata CBS 804.70, 
an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Chaetomium sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Myceliophthora hinnulea, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Thielavia cf. microspora, 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Aspergillus sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Scopulariopsis sp., 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Fusarium sp.,

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Verticillium sp., and 

an amino acid sequence encoded by the cellobiohydrolase I encoding part of the 
nucleotide sequence present in Phytophthora infestans;

(c) a polypeptide comprising an amino acid sequence selected from the group consisting of: 
an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1578 of SEQ ID NO:1,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1587 of SEQ ID NO:3,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1353 of SEQ ID NO:5,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1371 of SEQ ID NO:7,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1614 of SEQ ID NO:9,

an amino acid sequence which has at least 70% identity with the polypeptide encoded by
nucleotides 1 to 1245 of SEQ ID NO:11,

an amino acid sequence which has at least 70% identity with the polypeptide encoded by
nucleotides 1 to 1341 of SEQ ID NO:13,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1356 of SEQ ID NO:15,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1365 of SEQ ID NO:37,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1377 of SEQ ID NO:39,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1353 of SEQ ID NO:41,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1341 of SEQ ID NO:43,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1584 of SEQ ID NO:45,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1368 of SEQ ID NO:47,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1395 of SEQ ID NO:49,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1383 of SEQ ID NO:51,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1353 of SEQ ID NO:53,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1599 of SEQ ID NO:55,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1383 of SEQ ID NO:57,

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1578 of SEQ ID NO:59, and

an amino acid sequence which has at least 80% identity with the polypeptide encoded by
nucleotides 1 to 1371 of SEQ ID NO:65;






(d) a polypeptide which is encoded by a nucleotide sequence which hybridizes under high 
stringency conditions with a polynucleotide probe selected from the group consisting of: 
(i) the complementary strand of the nucleotides selected from the group consisting of: 





nucleotides 1 to 1578 of SEQ ID NO:1,
nucleotides 1 to 1587 of SEQ ID NO:3,
nucleotides 1 to 1353 of SEQ ID NO:5,
nucleotides 1 to 1371 of SEQ ID NO:7,
nucleotides 1 to 1614 of SEQ ID NO:9,
nucleotides 1 to 1245 of SEQ ID NO:11,
nucleotides 1 to 1341 of SEQ ID NO:13,
nucleotides 1 to 1356 of SEQ ID NO:15,
nucleotides 1 to 1365 of SEQ ID NO:37,
nucleotides 1 to 1377 of SEQ ID NO:39,
nucleotides 1 to 1353 of SEQ ID NO:41,
nucleotides 1 to 1341 of SEQ ID NO:43,
nucleotides 1 to 1584 of SEQ ID NO:45,
nucleotides 1 to 1368 of SEQ ID NO:47,
nucleotides 1 to 1395 of SEQ ID NO:49,
nucleotides 1 to 1383 of SEQ ID NO:51,
nucleotides 1 to 1353 of SEQ ID NO:53,
nucleotides 1 to 1599 of SEQ ID NO:55,
nucleotides 1 to 1383 of SEQ ID NO:57,
nucleotides 1 to 1578 of SEQ ID NO:59, and
nucleotides 1 to 1371 of SEQ ID NO:65;



(ii) the complementary strand of the nucleotides selected from the group consisting of:
nucleotides 1 to 500 of SEQ ID NO:1,
nucleotides 1 to 500 of SEQ ID NO:3,
nucleotides 1 to 500 of SEQ ID NO:5,
nucleotides 1 to 500 of SEQ ID NO:7,
nucleotides 1 to 500 of SEQ ID NO:9,
nucleotides 1 to 500 of SEQ ID NO:11,
nucleotides 1 to 500 of SEQ ID NO:13,
nucleotides 1 to 500 of SEQ ID NO:15,
nucleotides 1 to 500 of SEQ ID NO:37,
nucleotides 1 to 500 of SEQ ID NO:39,
nucleotides 1 to 500 of SEQ ID NO:41,
nucleotides 1 to 500 of SEQ ID NO:43,
nucleotides 1 to 500 of SEQ ID NO:45,
nucleotides 1 to 500 of SEQ ID NO:47,
nucleotides 1 to 500 of SEQ ID NO:49,
nucleotides 1 to 500 of SEQ ID NO:51,
nucleotides 1 to 500 of SEQ ID NO:53,
nucleotides 1 to 500 of SEQ ID NO:55,
nucleotides 1 to 500 of SEQ ID NO:57,
nucleotides 1 to 500 of SEQ ID NO:59,
nucleotides 1 to 500 of SEQ ID NO:65,
nucleotides 1 to 221 of SEQ ID NO:17,
nucleotides 1 to 239 of SEQ ID NO:18,
nucleotides 1 to 199 of SEQ ID NO:19,
nucleotides 1 to 191 of SEQ ID NO:20,
nucleotides 1 to 232 of SEQ ID NO:21,
nucleotides 1 to 467 of SEQ ID NO:22,
nucleotides 1 to 534 of SEQ ID NO:23,
nucleotides 1 to 563 of SEQ ID NO:24,
nucleotides 1 to 218 of SEQ ID NO:25,
nucleotides 1 to 492 of SEQ ID NO:26,
nucleotides 1 to 481 of SEQ ID NO:27,
nucleotides 1 to 463 of SEQ ID NO:28,
nucleotides 1 to 513 of SEQ ID NO:29,
nucleotides 1 to 579 of SEQ ID NO:30,
nucleotides 1 to 514 of SEQ ID NO:31,
nucleotides 1 to 477 of SEQ ID NO:32,
nucleotides 1 to 500 of SEQ ID NO:33,
nucleotides 1 to 470 of SEQ ID NO:34,
nucleotides 1 to 491 of SEQ ID NO:35,
nucleotides 1 to 221 of SEQ ID NO:36,
nucleotides 1 to 519 of SEQ ID NO:61,
nucleotides 1 to 497 of SEQ ID NO:62,
nucleotides 1 to 498 of SEQ ID NO:63,
nucleotides 1 to 525 of SEQ ID NO:64, and
nucleotides 1 to 951 of SEQ ID NO:67; and

(iii) the complementary strand of the nucleotides selected from the group consisting of:



nucleotides 1 to 200 of SEQ ID NO:1,
nucleotides 1 to 200 of SEQ ID NO:3,
nucleotides 1 to 200 of SEQ ID NO:5,
nucleotides 1 to 200 of SEQ ID NO:7,
nucleotides 1 to 200 of SEQ ID NO:9,
nucleotides 1 to 200 of SEQ ID NO:11,
nucleotides 1 to 200 of SEQ ID NO:13,
nucleotides 1 to 200 of SEQ ID NO:15,
nucleotides 1 to 200 of SEQ ID NO:37,
nucleotides 1 to 200 of SEQ ID NO:39,
nucleotides 1 to 200 of SEQ ID NO:41,
nucleotides 1 to 200 of SEQ ID NO:43,
nucleotides 1 to 200 of SEQ ID NO:45,
nucleotides 1 to 200 of SEQ ID NO:47,
nucleotides 1 to 200 of SEQ ID NO:49,
nucleotides 1 to 200 of SEQ ID NO:51,
nucleotides 1 to 200 of SEQ ID NO:53,
nucleotides 1 to 200 of SEQ ID NO:55,
nucleotides 1 to 200 of SEQ ID NO:57,
nucleotides 1 to 200 of SEQ ID NO:59, and
nucleotides 1 to 200 of SEQ ID NO:65; and



(e) a fragment of (a), (b) or (c) that has cellobiohydrolase I activity. 



2. The polypeptide according to claim 1, comprising an amino acid sequence selected from 
the group consisting of: 

an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, with amino acids 1 to 526 of SEQ ID NO:2,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, with amino acids 1 to 529 of SEQ ID NO:4,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, with amino acids 1 to 451 of SEQ ID NO:6,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, with amino acids 1 to 457 of SEQ ID NO:8,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, with amino acids 1 to 538 of SEQ ID NO:10,
an amino acid sequence which has at least 75% identity, preferably at least 80% identity, more
preferably at least 90% identity, with amino acids 1 to 415 of SEQ ID NO:12,
an amino acid sequence which has at least 75% identity, preferably at least 80% identity, more
preferably at least 90% identity, with amino acids 1 to 447 of SEQ ID NO:14,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, with amino acids 1 to 452 of SEQ ID NO:16,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 454 of SEQ ID NO:38,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 458 of SEQ ID NO:40,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 450 of SEQ ID NO:42,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 446 of SEQ ID NO:44,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 527 of SEQ ID NO:46,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 455 of SEQ ID NO:48,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 464 of SEQ ID NO:50,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 460 of SEQ ID NO:52,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 450 of SEQ ID NO:54,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 532 of SEQ ID NO:56,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 460 of SEQ ID NO:58,
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 525 of SEQ ID NO:60, and
an amino acid sequence which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity with amino acids 1 to 456 of SEQ ID NO:66.

3. The polypeptide according to any of claims 1-2, which consists of an amino acid sequence 
selected from the group consisting of:

4. The polypeptide according to any of claims 1-2, where the polypeptide is an artificial variant
which comprises an amino acid sequence that has at least one substitution, deletion and/or
insertion of an amino acid as compared to an amino acid sequence selected from the group
consisting of:

amino acids 1 to 526 of SEQ ID NO:2,
amino acids 1 to 529 of SEQ ID NO:4,
amino acids 1 to 451 of SEQ ID NO:6,
amino acids 1 to 457 of SEQ ID NO:8,
amino acids 1 to 538 of SEQ ID NO:10,
amino acids 1 to 415 of SEQ ID NO:12,
amino acids 1 to 447 of SEQ ID NO:14,
amino acids 1 to 452 of SEQ ID NO:16,
amino acids 1 to 454 of SEQ ID NO:38,
amino acids 1 to 458 of SEQ ID NO:40,
amino acids 1 to 450 of SEQ ID NO:42,

5. The polypeptide according to claim 1, comprising an amino acid sequence selected from
the group consisting of:

an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0584,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0581,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0585,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0582,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0583,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CBS 109513,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism DSM 14348,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0580,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0747,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0748,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0749,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism CGMCC No. 0750,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism DSM 15064,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism DSM 15065,
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism DSM 15066, and
an amino acid sequence which has at least 80% identity, preferably at least 90% identity, with
the polypeptide encoded by the cellobiohydrolase I encoding part of the nucleotide sequence
inserted into a plasmid present in the deposited microorganism DSM 15067.

6. The polypeptide according to claim 5, which comprises the amino acid sequence encoded
by the cellobiohydrolase I encoding part of the nucleotide sequence inserted into a plasmid
present in a deposited microorganism selected from the group consisting of:
CGMCC No. 0584,
CGMCC No. 0581,
CGMCC No. 0585,
CGMCC No. 0582,
CGMCC No. 0583,
CBS 109513,
DSM 14348,
CGMCC No. 0580,
CGMCC No. 0747,
CGMCC No. 0748,
CGMCC No. 0749,
CGMCC No. 0750,
DSM 15064,
DSM 15065,
DSM 15066, and
DSM 15067.

7. The polypeptide according to claims 5 or 6, which consists of the amino acid sequence 
encoded by the cellobiohydrolase I encoding part of the nucleotide sequence inserted into a 
plasmid present in a deposited microorganism selected from the group consisting of: 
CGMCC No. 0584, 
CGMCC No. 0581, 
CGMCC No. 0585, 
CGMCC No. 0582, 
CGMCC No. 0583, 
CBS 109513, 
DSM 14348, 
CGMCC No. 0580, 
CGMCC No. 0747, 
CGMCC No. 0748, 
CGMCC No. 0749, 
CGMCC No. 0750, 
DSM 15064, 
DSM 15065, 
DSM 15066, and 
DSM 15067. 

8. The polypeptide according to claims 5 or 6, where the polypeptide is an artificial variant 
which comprises an amino acid sequence that has at least one substitution, deletion and/or
insertion of an amino acid as compared to the amino acid sequence encoded by the
cellobiohydrolase I encoding part of the nucleotide sequence inserted into a plasmid present in
a deposited microorganism selected from the group consisting of:
CGMCC No. 0584,
CGMCC No. 0581,
CGMCC No. 0585,
CGMCC No. 0582,
CGMCC No. 0583,
CBS 109513,
DSM 14348,
CGMCC No. 0580,
CGMCC No. 0747,
CGMCC No. 0748,
CGMCC No. 0749,
CGMCC No. 0750,
DSM 15064,
DSM 15065,
DSM 15066, and
DSM 15067.

9. A polynucleotide having a nucleotide sequence which encodes for the polypeptide defined 
in any of claims 1-8.

10. A nucleic acid construct comprising the nucleotide sequence defined in claim 9 operably 
linked to one or more control sequences that direct the production of the polypeptide in a 
suitable host. 

11. A recombinant expression vector comprising the nucleic acid construct defined in claim 10.

12. A recombinant host cell comprising the nucleic acid construct defined in claim 11. 

13. A method for producing a polypeptide as defined in any of claims 1-8, the method
comprising: 

(a) cultivating a strain, which in its wild-type form is capable of producing the polypeptide, to 
produce the polypeptide; and 

(b) recovering the polypeptide. 

14. A method for producing a polypeptide as defined in any of claims 1-8, the method 
comprising: 

(a) cultivating a recombinant host cell as defined in claim 12 under conditions conducive for 
production of the polypeptide; and 
(b) recovering the polypeptide.

15. A method for in-situ production of a polypeptide as defined in any of claims 1-8, the
method comprising:

(a) cultivating a recombinant host cell as defined in claim 12 under conditions conducive for
production of the polypeptide; and

(b) contacting the polypeptide with a desired substrate without prior recovery of the
polypeptide.



16. A polynucleotide comprising a nucleotide sequence selected from the group consisting of:
a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1578 of SEQ ID
NO:1,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1587 of SEQ ID
NO:3,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1353 of SEQ ID
NO:5,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1371 of SEQ ID
NO:7,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1614 of SEQ ID
NO:9,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1245 of SEQ ID
NO:11,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1341 of SEQ ID
NO:13,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1356 of SEQ ID
NO:15,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1365 of SEQ ID
NO:37,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1377 of SEQ ID
NO:39,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1353 of SEQ ID
NO:41,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1341 of SEQ ID
NO:43,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1584 of SEQ ID
NO:45,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1368 of SEQ ID
NO:47,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1395 of SEQ ID
NO:49,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1383 of SEQ ID
NO:51,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1353 of SEQ ID
NO:53,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1599 of SEQ ID
NO:55,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1383 of SEQ ID
NO:57,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1578 of SEQ ID
NO:59,

a nucleotide sequence which has at least 80% identity with nucleotides 1 to 1371 of SEQ ID
NO:65,



a nucleotide sequence which has at least 80% 
NO:1, 

a nucleotide sequence which has at least 80% 
NO:3, 

a nucleotide sequence which has at least 80% 
NO:5, 

a nucleotide sequence which has at least 80% 
NO:7, 

a nucleotide sequence which has at least 80% 
NO:9, 

a nucleotide sequence which has at least 80% 
NO:11, 

a nucleotide sequence which has at least 80% 
NO:13, 

a nucleotide sequence which has at least 80% 
NO:15, 

a nucleotide sequence which has at least 80% 
NO:37, 

a nucleotide sequence which has at least 80% 
NO:39, 

a nucleotide sequence which has at least 80% 
NO:41, 

a nucleotide sequence which has at least 80% 
NO:43, 

a nucleotide sequence which has at least 80%
NO:45,

a nucleotide sequence which has a
NO:47,

a nucleotide sequence which has a
NO:49,

a nucleotide sequence which has a
NO:51,

a nucleotide sequence which has a

a nucleotide sequence which has a

a nucleotide sequence which has a
NO:57,

a nucleotide sequence which has a
NO:23,

a nucleotide sequence which has a

to 221 of SEQ ID

to 534 of SEQ ID

to 563 of SEQ ID

to 218 of SEQ ID

to 492 of SEQ ID

a nucleotide sequence which has at
NO:27, 

a nucleotide sequence which has at 
NO:28, 

a nucleotide sequence which has at 
NO:29, 

a nucleotide sequence which has at 
NO:30, 

a nucleotide sequence which has at 
NO:31, 

a nucleotide sequence which has at 
NO:32, 

a nucleotide sequence which has at 
NO:33, 

a nucleotide sequence which has at 
NO:34, 

a nucleotide sequence which has at 
NO:35, 

a nucleotide sequence which has at 
NO:36, 

a nucleotide sequence which has at 
NO:61, 

a nucleotide sequence which has at 
NO:62, 

a nucleotide sequence which has at 
NO:63, 

a nucleotide sequence which has at 
NO:64, and

a nucleotide sequence which has at 
NO:67. 



17. A polynucleotide comprising a nuc 
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eotide sequence selected from the group consisting of: 
east 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0584, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
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microorganism CGMCC No. 0581, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0585, 
5 a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0582, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
10 microorganism CGMCC No. 0583, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CBS 109513, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
15 part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism DSM 14348, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0580, 
20 a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0747, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
25 microorganism CGMCC No. 0748, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0749, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism CGMCC No. 0750, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism DSM 15064, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism DSM 15065, 




WO 03/000941 PCT/DK02/00429 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism DSM 15066, and 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence inserted into a plasmid present in the deposited 
microorganism DSM 15067. 

18. A polynucleotide comprising a nucleotide sequence selected from the group consisting of: 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Trichothecium roseum IFO 
5372, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Humicola nigrescens CBS 
819.73, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Cladorrhinum foecundissimum 
CBS 427.97, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Diplodia gossypina CBS 247.96, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Myceliophthora thermophila 
CBS 117.65, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Rhizomucor pusillus CBS 
109471, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Meripilus giganteus CBS 
521.95, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Exidia glandulosa CBS 2377.96, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Xylaria hypoxylon CBS 284.96, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Trichophaea saccata CBS 
804.70, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Acremonium sp., 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Chaetomium sp., 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Chaetomidium pingtungium, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Myceliophthora thermophila, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Myceliophthora hinnulea, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 

part of the nucleotide sequence present in the microorganism Sporotrichum pruinosum, 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Thielavia cf. microspora, 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Scytalidium sp., 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Aspergillus sp., 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Scopulariopsis sp., 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 

part of the nucleotide sequence present in the microorganism Fusarium sp., 

a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Verticillium sp., and 
a nucleotide sequence which has at least 80% identity with the cellobiohydrolase I encoding 
part of the nucleotide sequence present in the microorganism Phytophthora infestans. 



19. A polynucleotide having a nucleotide sequence which encodes a polypeptide having 
cellobiohydrolase I activity, and which hybridizes under high stringency conditions with a 
polynucleotide probe selected from the group consisting of 

(i) the complementary strand of the nucleotides selected from the group consisting of: 
nucleotides 1 to 1578 of SEQ ID NO:1, 
nucleotides 1 to 1587 of SEQ ID NO:3, 
nucleotides 1 to 1353 of SEQ ID NO:5, 
nucleotides 1 to 1371 of SEQ ID NO:7, 
nucleotides 1 to 1614 of SEQ ID NO:9, 
nucleotides 1 to 1245 of SEQ ID NO:11, 
nucleotides 1 to 1341 of SEQ ID NO:13, 
nucleotides 1 to 1356 of SEQ ID NO:15, 
nucleotides 1 to 1365 of SEQ ID NO:37, 
nucleotides 1 to 1377 of SEQ ID NO:39, 
nucleotides 1 to 1353 of SEQ ID NO:41, 
nucleotides 1 to 1341 of SEQ ID NO:43, 
nucleotides 1 to 1584 of SEQ ID NO:45, 
nucleotides 1 to 1368 of SEQ ID NO:47, 
nucleotides 1 to 1395 of SEQ ID NO:49, 
nucleotides 1 to 1383 of SEQ ID NO:51, 
nucleotides 1 to 1353 of SEQ ID NO:53, 
nucleotides 1 to 1599 of SEQ ID NO:55, 
nucleotides 1 to 1383 of SEQ ID NO:57, 
nucleotides 1 to 1578 of SEQ ID NO:59, and 
nucleotides 1 to 1371 of SEQ ID NO:65; 

(ii) the complementary strand of the nucleotides selected from the group consisting of: 
nucleotides 1 to 500 of SEQ ID NO:1, 
nucleotides 1 to 500 of SEQ ID NO:3, 
nucleotides 1 to 500 of SEQ ID NO:5, 
nucleotides 1 to 500 of SEQ ID NO:7, 
nucleotides 1 to 500 of SEQ ID NO:9, 
nucleotides 1 to 500 of SEQ ID NO:11, 
nucleotides 1 to 500 of SEQ ID NO:13, 
nucleotides 1 to 500 of SEQ ID NO:15, 
nucleotides 1 to 500 of SEQ ID NO:37, 
nucleotides 1 to 500 of SEQ ID NO:39, 
nucleotides 1 to 500 of SEQ ID NO:41, 
nucleotides 1 to 500 of SEQ ID NO:43, 
nucleotides 1 to 500 of SEQ ID NO:45, 
nucleotides 1 to 500 of SEQ ID NO:47, 
nucleotides 1 to 500 of SEQ ID NO:49, 
nucleotides 1 to 500 of SEQ ID NO:51, 
nucleotides 1 to 500 of SEQ ID NO:53, 
nucleotides 1 to 500 of SEQ ID NO:55, 
nucleotides 1 to 500 of SEQ ID NO:57, 
nucleotides 1 to 500 of SEQ ID NO:59, 
nucleotides 1 to 500 of SEQ ID NO:65, 
nucleotides 1 to 221 of SEQ ID NO:17, 
nucleotides 1 to 239 of SEQ ID NO:18, 
nucleotides 1 to 199 of SEQ ID NO:19, 
nucleotides 1 to 191 of SEQ ID NO:20, 
nucleotides 1 to 232 of SEQ ID NO:21, 
nucleotides 1 to 467 of SEQ ID NO:22, 
nucleotides 1 to 534 of SEQ ID NO:23, 
nucleotides 1 to 563 of SEQ ID NO:24, 
nucleotides 1 to 218 of SEQ ID NO:25, 
nucleotides 1 to 492 of SEQ ID NO:26, 
nucleotides 1 to 481 of SEQ ID NO:27, 
nucleotides 1 to 463 of SEQ ID NO:28, 
nucleotides 1 to 513 of SEQ ID NO:29, 
nucleotides 1 to 579 of SEQ ID NO:30, 
nucleotides 1 to 514 of SEQ ID NO:31, 
nucleotides 1 to 477 of SEQ ID NO:32, 
nucleotides 1 to 500 of SEQ ID NO:33, 
nucleotides 1 to 470 of SEQ ID NO:34, 
nucleotides 1 to 491 of SEQ ID NO:35, 
nucleotides 1 to 221 of SEQ ID NO:36, 
nucleotides 1 to 519 of SEQ ID NO:61, 
nucleotides 1 to 497 of SEQ ID NO:62, 
nucleotides 1 to 498 of SEQ ID NO:63, 
nucleotides 1 to 525 of SEQ ID NO:64, and 
nucleotides 1 to 951 of SEQ ID NO:67; and 

(iii) the complementary strand of the nucleotides selected from the group consisting of: 
nucleotides 1 to 200 of SEQ ID NO:1, 
nucleotides 1 to 200 of SEQ ID NO:3, 
nucleotides 1 to 200 of SEQ ID NO:5, 
nucleotides 1 to 200 of SEQ ID NO:7, 
nucleotides 1 to 200 of SEQ ID NO:9, 
nucleotides 1 to 200 of SEQ ID NO:11, 
nucleotides 1 to 200 of SEQ ID NO:13, 
nucleotides 1 to 200 of SEQ ID NO:15, 
nucleotides 1 to 200 of SEQ ID NO:37, 
nucleotides 1 to 200 of SEQ ID NO:39, 
nucleotides 1 to 200 of SEQ ID NO:41, 
nucleotides 1 to 200 of SEQ ID NO:43, 
nucleotides 1 to 200 of SEQ ID NO:45, 
nucleotides 1 to 200 of SEQ ID NO:47, 
nucleotides 1 to 200 of SEQ ID NO:49, 
nucleotides 1 to 200 of SEQ ID NO:51, 
nucleotides 1 to 200 of SEQ ID NO:53, 
nucleotides 1 to 200 of SEQ ID NO:55, 
nucleotides 1 to 200 of SEQ ID NO:57, 
nucleotides 1 to 200 of SEQ ID NO:59, and 
nucleotides 1 to 200 of SEQ ID NO:65. 



20. A polynucleotide comprising a modified nucleotide sequence selected from the group 
consisting of: 

the nucleotide sequence of SEQ ID NO:1 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 526 
of SEQ ID NO:2, 

the nucleotide sequence of SEQ ID NO:3 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 529 
of SEQ ID NO:4, 

the nucleotide sequence of SEQ ID NO:5 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 451 
of SEQ ID NO:6, 

the nucleotide sequence of SEQ ID NO:7 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 457 
of SEQ ID NO:8, 

the nucleotide sequence of SEQ ID NO:9 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 538 
of SEQ ID NO:10, 

the nucleotide sequence of SEQ ID NO:11 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 415 
of SEQ ID NO:12, 

the nucleotide sequence of SEQ ID NO:13 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 447 
of SEQ ID NO:14, 

the nucleotide sequence of SEQ ID NO:15 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 452 
of SEQ ID NO:16, 

the nucleotide sequence of SEQ ID NO:37 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 454 
of SEQ ID NO:38, 

the nucleotide sequence of SEQ ID NO:39 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 458 
of SEQ ID NO:40, 

the nucleotide sequence of SEQ ID NO:41 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 450 
of SEQ ID NO:42, 

the nucleotide sequence of SEQ ID NO:43 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 446 
of SEQ ID NO:44, 

the nucleotide sequence of SEQ ID NO:45 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 527 
of SEQ ID NO:46, 

the nucleotide sequence of SEQ ID NO:47 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 455 
of SEQ ID NO:48, 

the nucleotide sequence of SEQ ID NO:49 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 464 
of SEQ ID NO:50, 

the nucleotide sequence of SEQ ID NO:51 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 460 
of SEQ ID NO:52, 

the nucleotide sequence of SEQ ID NO:53 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 450 
of SEQ ID NO:54, 

the nucleotide sequence of SEQ ID NO:55 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 532 
of SEQ ID NO:56, 

the nucleotide sequence of SEQ ID NO:57 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 460 
of SEQ ID NO:58, 

the nucleotide sequence of SEQ ID NO:59 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 525 
of SEQ ID NO:60, and 

the nucleotide sequence of SEQ ID NO:65 comprising at least one modification, where the 
modified nucleotide sequence encodes a polypeptide which consists of amino acids 1 to 456 
of SEQ ID NO:66. 

21. A polypeptide having cellobiohydrolase I activity which is encoded by the cellobiohydrolase 
I encoding part of the nucleotide sequence present in a microorganism selected from the 
group consisting of: 

a microorganism belonging to Zygomycota, preferably belonging to the Mucorales, more 
preferably belonging to the family Mucoraceae or the family Choanephoraceae, most 
preferably belonging to the genus Rhizomucor or the genus Poitrasia, in particular Rhizomucor 
pusillus or Poitrasia circinans, 

a microorganism belonging to the Oomycetes, preferably belonging to the order Pythiales, 
more preferably belonging to the family Pythiaceae, most preferably belonging to the genus 
Phytophthora, in particular Phytophthora infestans, 

a microorganism belonging to Auriculariales, preferably belonging to the family Exidiaceae, 
more preferably belonging to the genus Exidia, most preferably Exidia glandulosa, 

a microorganism belonging to Xylariales, preferably belonging to the family Xylariaceae, more 
preferably belonging to the genus Xylaria, most preferably Xylaria hypoxylon, 

a microorganism belonging to Dothideales, preferably belonging to the family Dothideaceae, 
more preferably belonging to the genus Diplodia, most preferably Diplodia gossypina, 
a microorganism belonging to Pezizales, preferably belonging to the family Pyronemataceae 
or the family Sarcosomataceae, more preferably belonging to the genus Trichophaea or the 
genus Pseudoplectania, most preferably Trichophaea saccata or Pseudoplectania nigrella, 

a microorganism belonging to the family Rigidiporaceae, preferably belonging to the genus 
Meripilus, more preferably Meripilus giganteus, 

a microorganism belonging to the family Meruliaceae, preferably belonging to the genus 
Sporotrichum, more preferably Sporotrichum pruinosum, 
a microorganism belonging to the family Agaricaceae (under Basidiomycota, Hymenomycetes, 
Agaricales), more preferably belonging to the genus Coprinus, most preferably Coprinus 
cinereus, 

a microorganism belonging to the family Hypocreaceae, preferably belonging to the genus 
Acremonium or the genus Verticillium, more preferably Acremonium thermophilum or 
Verticillium tenerum, 

a microorganism belonging to the genus Cladorrhinum, preferably Cladorrhinum 
foecundissimum, 

a microorganism belonging to the genus Myceliophthora, preferably Myceliophthora 
thermophila or Myceliophthora hinnulea, 
a microorganism belonging to the genus Chaetomium, preferably Chaetomium thermophilum, 
a microorganism belonging to the genus Chaetomidium, preferably Chaetomidium 
pingtungium, 
a microorganism belonging to the genus Thielavia, preferably Thielavia australiensis or 
Thielavia microspora, 

a microorganism belonging to the genus Thermoascus, preferably Thermoascus aurantiacus, 
a microorganism belonging to the genus Trichothecium, preferably Trichothecium roseum, and 
a microorganism belonging to the species Humicola nigrescens. 

22. A method for shuffling of DNA comprising using the polynucleotide as defined in any of 
claims 9 and 16-20. 

23. A polynucleotide encoding a polypeptide having cellobiase activity obtainable by the 
method of claim 22. 

24. A polypeptide having cellobiase activity encoded by the polynucleotide of claim 23. 

25. Use of the polynucleotide as defined in any of claims 9 and 16-20 for DNA shuffling. 

26. A method for producing ethanol from biomass, comprising contacting the biomass with the 
polypeptide as defined in any of claims 1-8. 

27. Use of the polypeptide as defined in any of claims 1-8 for producing ethanol. 

28. A transgenic plant, plant part or plant cell, which has been transformed with a nucleotide 
sequence encoding a polypeptide having cellobiohydrolase I activity as defined in any of 
claims 1-8. 

29. A detergent composition comprising a surfactant and the polypeptide according to any of 
claims 1-8. 




SEQUENCE LISTING 

<110> Novozymes A/S 

<120> Polypeptides having cellobiohydrolase I activity and polynucleotides 
encoding same 

<130> 10129-WO 

<160> 67 

<170> PatentIn version 3.1 

<210> 1 

<211> 1581 

<212> DNA 

<213> Acremonium thermophilum 
<220> 

<221> CDS 

<222> (1)..(1581) 

<223> 

<400> 1 

atg cac gcc aag ttc gcg acc ctc gcc gcc ctt gtg gcg tcc gcc gcg 48 
Met His Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ser Ala Ala 
1 5 10 15 

gcc cag cag gcc tgc aca ctc acg gct gag aac cac ccc acc ctg tcg 96 
Ala Gln Gln Ala Cys Thr Leu Thr Ala Glu Asn His Pro Thr Leu Ser 
20 25 30 

tgg tcc aag tgc acg tcc ggc ggc agc tgc acc agc gtc tcg ggc tcc 144 
Trp Ser Lys Cys Thr Ser Gly Gly Ser Cys Thr Ser Val Ser Gly Ser 
35 40 45 

gtc acc atc gat gcc aac tgg cgg tgg act cac cag gtc tcg agc tcg 192 
Val Thr Ile Asp Ala Asn Trp Arg Trp Thr His Gln Val Ser Ser Ser 
50 55 60 

acc aac tgc tac acg ggc aat gag tgg gac acg tcc atc tgc acc gac 240 
Thr Asn Cys Tyr Thr Gly Asn Glu Trp Asp Thr Ser Ile Cys Thr Asp 
65 70 75 80 

ggt gct tcg tgc gcc gcc gcc tgc tgc ctc gat ggc gcc gac tac tcg 288 
Gly Ala Ser Cys Ala Ala Ala Cys Cys Leu Asp Gly Ala Asp Tyr Ser 
85 90 95 

ggc acc tat ggc atc acc acc agc ggc aac gcc ctc agc ctc cag ttc 336 
Gly Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ala Leu Ser Leu Gln Phe 
100 105 110 

gtc act cag ggc ccc tac tcg acc aac att ggc tcg cgt acc tac ctg 384 
Val Thr Gln Gly Pro Tyr Ser Thr Asn Ile Gly Ser Arg Thr Tyr Leu 
115 120 125 

atg gcc tcg gac acc aag tac cag atg ttc act ctg ctc ggc aac gag 432 
Met Ala Ser Asp Thr Lys Tyr Gln Met Phe Thr Leu Leu Gly Asn Glu 
130 135 140 

ttc acc ttc gac gtg gac gtc aca ggc ctc ggc tgc ggt ctg aac ggc 480 
Phe Thr Phe Asp Val Asp Val Thr Gly Leu Gly Cys Gly Leu Asn Gly 
145 150 155 160 

gcc ctc tac ttc gtc tcc atg gac gag gac ggt ggt ctt tcc aag tac 528 
Ala Leu Tyr Phe Val Ser Met Asp Glu Asp Gly Gly Leu Ser Lys Tyr 
165 170 175 

tcg ggc aac aag gct ggc gcc aag tac ggc acc ggc tac tgc gac tcg 576 
Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser 
180 185 190 

cag tgc ccc cgc gac ctc aag ttc atc aac ggc gag gct aac aac gtt 624 
Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Asn Val 
195 200 205 

ggc tgg acc ccg tcg tcc aac gac aag aac gcc ggc ttg ggc aac tac 672 
Gly Trp Thr Pro Ser Ser Asn Asp Lys Asn Ala Gly Leu Gly Asn Tyr 
210 215 220 

ggc agc tgc tgc tcc gag atg gat gtc tgg gag gcc aac agc atc tcg 720 
Gly Ser Cys Cys Ser Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser 
225 230 235 240 

gcg gcc tac acg ccc cat cct tgc act acc atc ggc cag acg cgc tgc 768 
Ala Ala Tyr Thr Pro His Pro Cys Thr Thr Ile Gly Gln Thr Arg Cys 
245 250 255 

gag ggc gac gac tgc ggt ggt acc tac agc act gac cgc tac gcc ggc 816 
Glu Gly Asp Asp Cys Gly Gly Thr Tyr Ser Thr Asp Arg Tyr Ala Gly 
260 265 270 

gag tgc gac cct gac gga tgc gac ttc aac tcg tac cgc atg ggc aac 864 
Glu Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asn 
275 280 285 

acg acc ttc tac ggc aag ggc atg acc gtc gac acc agc aag aag ttc 912 
Thr Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Ser Lys Lys Phe 
290 295 300 

acg gtg gtg acc cag ttc ctg acg gac tcg tct ggc aac ctg tcc gag 960 
Thr Val Val Thr Gln Phe Leu Thr Asp Ser Ser Gly Asn Leu Ser Glu 
305 310 315 320 

atc aag cgc ttc tac gtc cag aac ggc gtc gtc att ccc aac tcg aac 1008 
Ile Lys Arg Phe Tyr Val Gln Asn Gly Val Val Ile Pro Asn Ser Asn 
325 330 335 

tcc aac atc gcg ggc gtc tcg ggc aac tcc atc acc cag gcc ttc tgc 1056 
Ser Asn Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Gln Ala Phe Cys 
340 345 350 

gat gct cag aag acc gct ttc ggc gac acc aac gtc ttc gac caa aag 1104 
Asp Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Val Phe Asp Gln Lys 
355 360 365 

ggc ggc ctg gcc cag atg ggc aag gct ctt gcc cag ccc atg gtc ctc 1152 
Gly Gly Leu Ala Gln Met Gly Lys Ala Leu Ala Gln Pro Met Val Leu 
370 375 380 

gtc atg tcc ctc tgg gac gac cac gcc gtc aac atg ctc tgg ctc gac 1200 
Val Met Ser Leu Trp Asp Asp His Ala Val Asn Met Leu Trp Leu Asp 
385 390 395 400 

tcg acc tac ccg acc aac gcg gcc ggc aag ccg ggc gcc gcc cgc ggt 1248 
Ser Thr Tyr Pro Thr Asn Ala Ala Gly Lys Pro Gly Ala Ala Arg Gly 
405 410 415 

acc tgc ccc acc acc tcg ggc gtc ccc gcc gac gtc gag tcc cag gcg 1296 
Thr Cys Pro Thr Thr Ser Gly Val Pro Ala Asp Val Glu Ser Gln Ala 
420 425 430 

ccc aac tcc aag gtc atc tac tcc aac atc cgc ttc ggc ccc atc ggc 1344 
Pro Asn Ser Lys Val Ile Tyr Ser Asn Ile Arg Phe Gly Pro Ile Gly 
435 440 445 

tcc acc gtc tcc ggc ctg ccc ggc ggc ggc agc aac ccc ggc ggc ggc 1392 
Ser Thr Val Ser Gly Leu Pro Gly Gly Gly Ser Asn Pro Gly Gly Gly 
450 455 460 

tcc agc tcc acc acc acc acc acc aga ccc gcc acc tcc acc acc tcc 1440 
Ser Ser Ser Thr Thr Thr Thr Thr Arg Pro Ala Thr Ser Thr Thr Ser 
465 470 475 480 

tcg gcc agc tcc ggc ccg acc ggc ggt ggc acg gct gcc cac tgg ggc 1488 
Ser Ala Ser Ser Gly Pro Thr Gly Gly Gly Thr Ala Ala His Trp Gly 
485 490 495 

cag tgc ggc ggc atc ggc tgg acc ggc ccg acc gtc tgc gcc tcg ccc 1536 
Gln Cys Gly Gly Ile Gly Trp Thr Gly Pro Thr Val Cys Ala Ser Pro 
500 505 510 

tac acc tgc cag aag ctg aac gac tgg tac tac cag tgc ctc taa 1581 
Tyr Thr Cys Gln Lys Leu Asn Asp Trp Tyr Tyr Gln Cys Leu 
515 520 525 



<210> 2 

<211> 526 

<212> PRT 

<213> Acremonium thermophilum 



<400> 2 



Met His Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ser Ala Ala 
1 5 10 15 

Ala Gln Gln Ala Cys Thr Leu Thr Ala Glu Asn His Pro Thr Leu Ser 
20 25 30 

Trp Ser Lys Cys Thr Ser Gly Gly Ser Cys Thr Ser Val Ser Gly Ser 
35 40 45 

Val Thr Ile Asp Ala Asn Trp Arg Trp Thr His Gln Val Ser Ser Ser 
50 55 60 

Thr Asn Cys Tyr Thr Gly Asn Glu Trp Asp Thr Ser Ile Cys Thr Asp 
65 70 75 80 

Gly Ala Ser Cys Ala Ala Ala Cys Cys Leu Asp Gly Ala Asp Tyr Ser 
85 90 95 

Gly Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ala Leu Ser Leu Gln Phe 
100 105 110 

Val Thr Gln Gly Pro Tyr Ser Thr Asn Ile Gly Ser Arg Thr Tyr Leu 
115 120 125 

Met Ala Ser Asp Thr Lys Tyr Gln Met Phe Thr Leu Leu Gly Asn Glu 
130 135 140 

Phe Thr Phe Asp Val Asp Val Thr Gly Leu Gly Cys Gly Leu Asn Gly 
145 150 155 160 

Ala Leu Tyr Phe Val Ser Met Asp Glu Asp Gly Gly Leu Ser Lys Tyr 
165 170 175 

Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser 
180 185 190 

Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Asn Val 
195 200 205 

Gly Trp Thr Pro Ser Ser Asn Asp Lys Asn Ala Gly Leu Gly Asn Tyr 
210 215 220 

Gly Ser Cys Cys Ser Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser 
225 230 235 240 

Ala Ala Tyr Thr Pro His Pro Cys Thr Thr Ile Gly Gln Thr Arg Cys 
245 250 255 

Glu Gly Asp Asp Cys Gly Gly Thr Tyr Ser Thr Asp Arg Tyr Ala Gly 
260 265 270 

Glu Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asn 
275 280 285 

Thr Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Ser Lys Lys Phe 
290 295 300 

Thr Val Val Thr Gln Phe Leu Thr Asp Ser Ser Gly Asn Leu Ser Glu 
305 310 315 320 

Ile Lys Arg Phe Tyr Val Gln Asn Gly Val Val Ile Pro Asn Ser Asn 
325 330 335 

Ser Asn Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Gln Ala Phe Cys 
340 345 350 

Asp Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Val Phe Asp Gln Lys 
355 360 365 

Gly Gly Leu Ala Gln Met Gly Lys Ala Leu Ala Gln Pro Met Val Leu 
370 375 380 

Val Met Ser Leu Trp Asp Asp His Ala Val Asn Met Leu Trp Leu Asp 
385 390 395 400 

Ser Thr Tyr Pro Thr Asn Ala Ala Gly Lys Pro Gly Ala Ala Arg Gly 
405 410 415 

Thr Cys Pro Thr Thr Ser Gly Val Pro Ala Asp Val Glu Ser Gln Ala 
420 425 430 

Pro Asn Ser Lys Val Ile Tyr Ser Asn Ile Arg Phe Gly Pro Ile Gly 
435 440 445 

Ser Thr Val Ser Gly Leu Pro Gly Gly Gly Ser Asn Pro Gly Gly Gly 
450 455 460 

Ser Ser Ser Thr Thr Thr Thr Thr Arg Pro Ala Thr Ser Thr Thr Ser 
465 470 475 480 

Ser Ala Ser Ser Gly Pro Thr Gly Gly Gly Thr Ala Ala His Trp Gly 
485 490 495 

Gln Cys Gly Gly Ile Gly Trp Thr Gly Pro Thr Val Cys Ala Ser Pro 
500 505 510 

Tyr Thr Cys Gln Lys Leu Asn Asp Trp Tyr Tyr Gln Cys Leu 
515 520 525 



<210> 3 

<211> 1590 

<212> DNA 

<213> Chaetomium thermophilum 
<220> 

<221> CDS 

<222> (1)..(1590) 

<223> 

<400> 3 

atg atg tac aag aag ttc gcc gct ctc gcc gcc ctc gtg gct ggc gcc 48 
Met Met Tyr Lys Lys Phe Ala Ala Leu Ala Ala Leu Val Ala Gly Ala 
1 5 10 15 

gcc gcc cag cag gct tgc tcc ctc acc act gag acc cac ccc aga ctc 96 
Ala Ala Gln Gln Ala Cys Ser Leu Thr Thr Glu Thr His Pro Arg Leu 
20 25 30 

act tgg aag cgc tgc acc tct ggc ggc aac tgc tcg acc gtg aac ggc 144 
Thr Trp Lys Arg Cys Thr Ser Gly Gly Asn Cys Ser Thr Val Asn Gly 
35 40 45 

gcc gtc acc atc gat gcc aac tgg cgc tgg act cac acc gtt tcc ggc 192 
Ala Val Thr Ile Asp Ala Asn Trp Arg Trp Thr His Thr Val Ser Gly 
50 55 60 

tcg acc aac tgc tac acc ggc aac gag tgg gat acc tcc atc tgc tct 240 
Ser Thr Asn Cys Tyr Thr Gly Asn Glu Trp Asp Thr Ser Ile Cys Ser 
65 70 75 80 

gat ggc aag agc tgc gcc cag acc tgc tgc gtc gac ggc gct gac tac 288 
Asp Gly Lys Ser Cys Ala Gln Thr Cys Cys Val Asp Gly Ala Asp Tyr 
85 90 95 

tct tcg acc tat ggt atc acc acc agc ggt gac tcc ctg aac ctc aag 336 
Ser Ser Thr Tyr Gly Ile Thr Thr Ser Gly Asp Ser Leu Asn Leu Lys 
100 105 110 

ttc gtc acc aag cac cag tac ggc acc aat gtc ggc tct cgt gtc tac 384 
Phe Val Thr Lys His Gln Tyr Gly Thr Asn Val Gly Ser Arg Val Tyr 
115 120 125 

ctg atg gag aac gac acc aag tac cag atg ttc gag ctc ctc ggc aac 432 
Leu Met Glu Asn Asp Thr Lys Tyr Gln Met Phe Glu Leu Leu Gly Asn 
130 135 140 

gag ttc acc ttc gat gtc gat gtc tct aac ctg ggc tgc ggt ctc aac 480 
Glu Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn 
145 150 155 160 

ggt gcc ctc tac ttc gtc tcc atg gac gct gat ggt ggt atg agc aag 528 
Gly Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys 
165 170 175 

tac tct ggc aac aag gct ggc gcc aag tac ggg acg ggg tac tgt gat 576 
Tyr Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp 
180 185 190 

gct cag tgc ccg cgc gac ctt aag ttc atc aac ggc gag gcc aac att 624 
Ala Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Ile 
195 200 205 

gag aac tgg acc cct tcg acc aat gat gcc aac gcc ggt ttc ggc cgc 672 
Glu Asn Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Gly Phe Gly Arg 
210 215 220 

tat ggc agc tgc tgc tct gag atg gat atc tgg gag gcc aac aac atg 720 
Tyr Gly Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Met 
225 230 235 240 

gct act gcc ttc act cct cac cct tgc acc att atc ggc cag agc cgc 768 
Ala Thr Ala Phe Thr Pro His Pro Cys Thr Ile Ile Gly Gln Ser Arg 
245 250 255 

tgc gag ggc aac agc tgc ggt ggc acc tac agc tct gag cgc tat gct 816 
Cys Glu Gly Asn Ser Cys Gly Gly Thr Tyr Ser Ser Glu Arg Tyr Ala 
260 265 270 

ggt gtt tgc gat cct gat ggc tgc gac ttc aac gcc tac cgc cag ggc 864 
Gly Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ala Tyr Arg Gln Gly 
275 280 285 

gac aag acc ttc tac ggc aag ggc atg acc gtc gac acc acc aag aag 912 
Asp Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys 
290 295 300 

atg acc gtc gtc acc cag ttc cac aag aac tcg gct ggc gtc ctc agc 960 
Met Thr Val Val Thr Gln Phe His Lys Asn Ser Ala Gly Val Leu Ser 
305 310 315 320 

gag atc aag cgc ttc tac gtt cag gac ggc aag gtc att gcc aac gcc 1008 
Glu Ile Lys Arg Phe Tyr Val Gln Asp Gly Lys Val Ile Ala Asn Ala 
325 330 335 

gag tcc aag atc ccc ggc aac ccc ggc aac tcc atc acc cag gag tgg 1056 
Glu Ser Lys Ile Pro Gly Asn Pro Gly Asn Ser Ile Thr Gln Glu Trp 
340 345 350 

tgc gat gcc cag aag gtc gcc ttc ggt gac atc gat gac ttc aac cgc 1104 
Cys Asp Ala Gln Lys Val Ala Phe Gly Asp Ile Asp Asp Phe Asn Arg 
355 360 365 

aag ggc ggt atg gct cag atg agc aag gcc ctc gaa ggc cct atg gtc 1152 
Lys Gly Gly Met Ala Gln Met Ser Lys Ala Leu Glu Gly Pro Met Val 
370 375 380 

ctg gtc atg tcc gtc tgg gat gac cac tac gcc aac atg ctc tgg ctc 1200 
Leu Val Met Ser Val Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu 
385 390 395 400 

gac tcg acc tac ccc atc gac aag gcc ggc acc ccc ggc gcc gag cgc 1248 
Asp Ser Thr Tyr Pro Ile Asp Lys Ala Gly Thr Pro Gly Ala Glu Arg 
405 410 415 

ggt gct tgc ccg acc acc tcc ggt gtc cct gcc gag att gag gcc cag 1296 
Gly Ala Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Ile Glu Ala Gln 
420 425 430 

gtc ccc aac agc aac gtc atc ttc tcc aac atc cgc ttc ggc ccc atc 1344 
Val Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile 
435 440 445 

ggc tcg acc gtc cct ggc ctc gac ggc agc act ccc agc aac ccg acc 1392 
Gly Ser Thr Val Pro Gly Leu Asp Gly Ser Thr Pro Ser Asn Pro Thr 
450 455 460 

gcc acc gtt gct cct ccc act tct acc acc agc gtg aga agc agc act 1440 
Ala Thr Val Ala Pro Pro Thr Ser Thr Thr Ser Val Arg Ser Ser Thr 
465 470 475 480 

act cag att tcc acc ccg act agc cag ccc ggc ggc tgc acc acc cag 1488 
Thr Gln Ile Ser Thr Pro Thr Ser Gln Pro Gly Gly Cys Thr Thr Gln 
485 490 495 

aag tgg ggc cag tgc ggt ggt atc ggc tac acc ggc tgc act aac tgc 1536 
Lys Trp Gly Gln Cys Gly Gly Ile Gly Tyr Thr Gly Cys Thr Asn Cys 
500 505 510 

gtt gct ggc act acc tgc act gag ctc aac ccc tgg tac agc cag tgc 1584 
Val Ala Gly Thr Thr Cys Thr Glu Leu Asn Pro Trp Tyr Ser Gln Cys 
515 520 525 

ctg taa 1590 
Leu 



<210> 4 

<211> 529 

<212> PRT 

<213> Chaetomium thermophilum 

<400> 4 

Met Met Tyr Lys Lys Phe Ala Ala Leu Ala Ala Leu Val Ala Gly Ala
1 5 10 15

Ala Ala Gln Gln Ala Cys Ser Leu Thr Thr Glu Thr His Pro Arg Leu
20 25 30

Thr Trp Lys Arg Cys Thr Ser Gly Gly Asn Cys Ser Thr Val Asn Gly
35 40 45

Ala Val Thr Ile Asp Ala Asn Trp Arg Trp Thr His Thr Val Ser Gly
50 55 60

Ser Thr Asn Cys Tyr Thr Gly Asn Glu Trp Asp Thr Ser Ile Cys Ser
65 70 75 80

Asp Gly Lys Ser Cys Ala Gln Thr Cys Cys Val Asp Gly Ala Asp Tyr
85 90 95

Ser Ser Thr Tyr Gly Ile Thr Thr Ser Gly Asp Ser Leu Asn Leu Lys
100 105 110

Phe Val Thr Lys His Gln Tyr Gly Thr Asn Val Gly Ser Arg Val Tyr
115 120 125

Leu Met Glu Asn Asp Thr Lys Tyr Gln Met Phe Glu Leu Leu Gly Asn
130 135 140

Glu Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn
145 150 155 160

Gly Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys
165 170 175

Tyr Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

Ala Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Ile
195 200 205

Glu Asn Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Gly Phe Gly Arg
210 215 220

Tyr Gly Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Met
225 230 235 240

Ala Thr Ala Phe Thr Pro His Pro Cys Thr Ile Ile Gly Gln Ser Arg
245 250 255

Cys Glu Gly Asn Ser Cys Gly Gly Thr Tyr Ser Ser Glu Arg Tyr Ala
260 265 270

Gly Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ala Tyr Arg Gln Gly
275 280 285

Asp Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys
290 295 300

Met Thr Val Val Thr Gln Phe His Lys Asn Ser Ala Gly Val Leu Ser
305 310 315 320

Glu Ile Lys Arg Phe Tyr Val Gln Asp Gly Lys Val Ile Ala Asn Ala
325 330 335

Glu Ser Lys Ile Pro Gly Asn Pro Gly Asn Ser Ile Thr Gln Glu Trp
340 345 350

Cys Asp Ala Gln Lys Val Ala Phe Gly Asp Ile Asp Asp Phe Asn Arg
355 360 365

Lys Gly Gly Met Ala Gln Met Ser Lys Ala Leu Glu Gly Pro Met Val
370 375 380

Leu Val Met Ser Val Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu
385 390 395 400

Asp Ser Thr Tyr Pro Ile Asp Lys Ala Gly Thr Pro Gly Ala Glu Arg
405 410 415

Gly Ala Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Ile Glu Ala Gln
420 425 430

Val Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile
435 440 445

Gly Ser Thr Val Pro Gly Leu Asp Gly Ser Thr Pro Ser Asn Pro Thr
450 455 460

Ala Thr Val Ala Pro Pro Thr Ser Thr Thr Ser Val Arg Ser Ser Thr
465 470 475 480

Thr Gln Ile Ser Thr Pro Thr Ser Gln Pro Gly Gly Cys Thr Thr Gln
485 490 495

Lys Trp Gly Gln Cys Gly Gly Ile Gly Tyr Thr Gly Cys Thr Asn Cys
500 505 510

Val Ala Gly Thr Thr Cys Thr Glu Leu Asn Pro Trp Tyr Ser Gln Cys
515 520 525

Leu



<210> 5 

<211> 1356 

<212> DNA 

<213> Scytalidium sp.
<220> 

<221> CDS 

<222> (1)..(1356)

<223> 

<400> 5 

atg cag atc aag agc tac atc cag tac ctg gcc gcg gct ctg ccg ctc 48
Met Gln Ile Lys Ser Tyr Ile Gln Tyr Leu Ala Ala Ala Leu Pro Leu
1 5 10 15

ctg agc agc gtc gct gcc cag cag gcc ggc acc atc acc gcc gag aac 96
Leu Ser Ser Val Ala Ala Gln Gln Ala Gly Thr Ile Thr Ala Glu Asn
20 25 30

cac ccc agg atg acc tgg aag agg tgc tcg ggc ccc ggc aac tgc cag 144
His Pro Arg Met Thr Trp Lys Arg Cys Ser Gly Pro Gly Asn Cys Gln
35 40 45

acc gtg cag ggc gag gtc gtc atc gac gcc aac tgg cgc tgg ctg cac 192
Thr Val Gln Gly Glu Val Val Ile Asp Ala Asn Trp Arg Trp Leu His
50 55 60

aac aac ggc cag aac tgc tat gag ggc aac aag tgg acc agc cag tgc 240
Asn Asn Gly Gln Asn Cys Tyr Glu Gly Asn Lys Trp Thr Ser Gln Cys
65 70 75 80

agc tcg gcc acc gac tgc gcg cag agg tgc gcc ctc gac ggt gcc aac 288
Ser Ser Ala Thr Asp Cys Ala Gln Arg Cys Ala Leu Asp Gly Ala Asn
85 90 95

tac cag tcg acc tac ggc gcc tcg acc agc ggc gac tcc ctg acg ctc 336
Tyr Gln Ser Thr Tyr Gly Ala Ser Thr Ser Gly Asp Ser Leu Thr Leu
100 105 110

aag ttc gtc acc aag cac gag tac ggc acc aac atc ggc tcg cgc ttc 384
Lys Phe Val Thr Lys His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe
115 120 125

tac ctc atg gcc aac cag aac aag tac cag atg ttc acc ctg atg aac 432
Tyr Leu Met Ala Asn Gln Asn Lys Tyr Gln Met Phe Thr Leu Met Asn
130 135 140

aac gag ttc gcc ttc gat gtc gac ctc tcc aag gtt gag tgc ggt atc 480
Asn Glu Phe Ala Phe Asp Val Asp Leu Ser Lys Val Glu Cys Gly Ile
145 150 155 160

aac agc gct ctg tac ttc gtc gcc atg gag gag gat ggt ggc atg gcc 528
Asn Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Met Ala
165 170 175

agc tac ccg agc aac cgt gct ggt gcc aag tac ggc acg ggc tac tgc 576
Ser Tyr Pro Ser Asn Arg Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys
180 185 190

gat gcc caa tgc gcc cgt gac ctc aag ttc att ggc ggc aag gcc aac 624
Asp Ala Gln Cys Ala Arg Asp Leu Lys Phe Ile Gly Gly Lys Ala Asn
195 200 205

att gag ggc tgg cgc ccg tcc acc aac gac ccc aac gcc ggt gtc ggt 672
Ile Glu Gly Trp Arg Pro Ser Thr Asn Asp Pro Asn Ala Gly Val Gly
210 215 220

ccc atg ggt gcc tgc tgc gct gag atc gac gtt tgg gag tcc aac gcc 720
Pro Met Gly Ala Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Ala
225 230 235 240

tat gct tat gcc ttc acc ccc cac gcc tgc ggc agc aag aac cgc tac 768
Tyr Ala Tyr Ala Phe Thr Pro His Ala Cys Gly Ser Lys Asn Arg Tyr
245 250 255

cac atc tgc gag acc aac aac tgc ggt ggt acc tac tcg gat gac cgc 816
His Ile Cys Glu Thr Asn Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg
260 265 270

ttc gcc ggc tac tgc gac gcc aac ggc tgc gac tac aac ccc tac cgc 864
Phe Ala Gly Tyr Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg
275 280 285

atg ggc aac aag gac ttc tat ggc aag ggc aag acc gtc gac acc aac 912
Met Gly Asn Lys Asp Phe Tyr Gly Lys Gly Lys Thr Val Asp Thr Asn
290 295 300

cgc aag ttc acc gtt gtc tcc cgc ttc gag cgt aac agg ctc tct cag 960
Arg Lys Phe Thr Val Val Ser Arg Phe Glu Arg Asn Arg Leu Ser Gln
305 310 315 320

ttc ttc gtc cag gac ggc cgc aag atc gag gtg ccc cct ccg acc tgg 1008
Phe Phe Val Gln Asp Gly Arg Lys Ile Glu Val Pro Pro Pro Thr Trp
325 330 335

ccc ggc ctc ccg aac agc gcc gac atc acc cct gag ctc tgc gat gct 1056
Pro Gly Leu Pro Asn Ser Ala Asp Ile Thr Pro Glu Leu Cys Asp Ala
340 345 350

cag ttc cgc gtc ttc gat gac cgc aac cgc ttc gcc gag acc ggt ggc 1104
Gln Phe Arg Val Phe Asp Asp Arg Asn Arg Phe Ala Glu Thr Gly Gly
355 360 365

ttc gat gct ctg aac gag gcc ctc acc att ccc atg gtc ctt gtc atg 1152
Phe Asp Ala Leu Asn Glu Ala Leu Thr Ile Pro Met Val Leu Val Met
370 375 380

tcc atc tgg gat gac cac cac tcc aac atg ctc tgg ctc gac tcc agc 1200
Ser Ile Trp Asp Asp His His Ser Asn Met Leu Trp Leu Asp Ser Ser
385 390 395 400

tac ccg ccc gag aag gcc ggc ctc ccc ggt ggc gac cgt ggc ccg tgc 1248
Tyr Pro Pro Glu Lys Ala Gly Leu Pro Gly Gly Asp Arg Gly Pro Cys
405 410 415

ccg acc acc tct ggt gtc cct gcc gag gtc gag gct cag tac ccc gat 1296
Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Gln Tyr Pro Asp
420 425 430

gct cag gtc gtc tgg tcc aac atc cgc ttc ggc ccc atc ggc tcg acc 1344
Ala Gln Val Val Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr
435 440 445

gtc aac gtc taa 1356 
Val Asn Val 
450 



<210> 6 

<211> 451 

<212> PRT 

<213> Scytalidium sp.



<400> 6 



Met Gln Ile Lys Ser Tyr Ile Gln Tyr Leu Ala Ala Ala Leu Pro Leu
1 5 10 15

Leu Ser Ser Val Ala Ala Gln Gln Ala Gly Thr Ile Thr Ala Glu Asn
20 25 30

His Pro Arg Met Thr Trp Lys Arg Cys Ser Gly Pro Gly Asn Cys Gln
35 40 45

Thr Val Gln Gly Glu Val Val Ile Asp Ala Asn Trp Arg Trp Leu His
50 55 60

Asn Asn Gly Gln Asn Cys Tyr Glu Gly Asn Lys Trp Thr Ser Gln Cys
65 70 75 80

Ser Ser Ala Thr Asp Cys Ala Gln Arg Cys Ala Leu Asp Gly Ala Asn
85 90 95

Tyr Gln Ser Thr Tyr Gly Ala Ser Thr Ser Gly Asp Ser Leu Thr Leu
100 105 110

Lys Phe Val Thr Lys His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe
115 120 125

Tyr Leu Met Ala Asn Gln Asn Lys Tyr Gln Met Phe Thr Leu Met Asn
130 135 140

Asn Glu Phe Ala Phe Asp Val Asp Leu Ser Lys Val Glu Cys Gly Ile
145 150 155 160

Asn Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Met Ala
165 170 175

Ser Tyr Pro Ser Asn Arg Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys
180 185 190

Asp Ala Gln Cys Ala Arg Asp Leu Lys Phe Ile Gly Gly Lys Ala Asn
195 200 205

Ile Glu Gly Trp Arg Pro Ser Thr Asn Asp Pro Asn Ala Gly Val Gly
210 215 220

Pro Met Gly Ala Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Ala
225 230 235 240

Tyr Ala Tyr Ala Phe Thr Pro His Ala Cys Gly Ser Lys Asn Arg Tyr
245 250 255

His Ile Cys Glu Thr Asn Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg
260 265 270

Phe Ala Gly Tyr Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg
275 280 285

Met Gly Asn Lys Asp Phe Tyr Gly Lys Gly Lys Thr Val Asp Thr Asn
290 295 300

Arg Lys Phe Thr Val Val Ser Arg Phe Glu Arg Asn Arg Leu Ser Gln
305 310 315 320

Phe Phe Val Gln Asp Gly Arg Lys Ile Glu Val Pro Pro Pro Thr Trp
325 330 335

Pro Gly Leu Pro Asn Ser Ala Asp Ile Thr Pro Glu Leu Cys Asp Ala
340 345 350

Gln Phe Arg Val Phe Asp Asp Arg Asn Arg Phe Ala Glu Thr Gly Gly
355 360 365

Phe Asp Ala Leu Asn Glu Ala Leu Thr Ile Pro Met Val Leu Val Met
370 375 380

Ser Ile Trp Asp Asp His His Ser Asn Met Leu Trp Leu Asp Ser Ser
385 390 395 400

Tyr Pro Pro Glu Lys Ala Gly Leu Pro Gly Gly Asp Arg Gly Pro Cys
405 410 415

Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Gln Tyr Pro Asp
420 425 430

Ala Gln Val Val Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr
435 440 445

Val Asn Val
450



<210> 7

<211> 1374

<212> DNA

<213> Thermoascus
<220>

<221> CDS

<222> (1)..(1374)

<223>

<400> 7



atg tat cag cgc gct ctt ctc ttc tct ttc ttc ctc tcc gcc gcc cgc 48
Met Tyr Gln Arg Ala Leu Leu Phe Ser Phe Phe Leu Ser Ala Ala Arg
1 5 10 15

gcg cag cag gcc ggt acc cta acc gca gag aat cac cct tcc ctg acc 96
Ala Gln Gln Ala Gly Thr Leu Thr Ala Glu Asn His Pro Ser Leu Thr
20 25 30

tgg cag caa tgc tcc agc ggc ggt agt tgt acc acg cag aat gga aaa 144
Trp Gln Gln Cys Ser Ser Gly Gly Ser Cys Thr Thr Gln Asn Gly Lys
35 40 45

gtc gtt atc gat gcg aac tgg cgt tgg gtc cat acc acc tct gga tac 192
Val Val Ile Asp Ala Asn Trp Arg Trp Val His Thr Thr Ser Gly Tyr
50 55 60

acc aac tgc tac acg ggc aat acg tgg gac acc agt atc tgt ccc gac 240
Thr Asn Cys Tyr Thr Gly Asn Thr Trp Asp Thr Ser Ile Cys Pro Asp
65 70 75 80

gac gtg acc tgc gct cag aat tgt gcc ttg gat gga gcg gat tac agt 288
Asp Val Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser
85 90 95

ggc acc tat ggt gtt acg acc agt ggc aac gcc ctg aga ctg aac ttt 336
Gly Thr Tyr Gly Val Thr Thr Ser Gly Asn Ala Leu Arg Leu Asn Phe
100 105 110

gtc acc caa agc tca ggg aag aac att ggc tcg cgc ctg tac ctg ctg 384
Val Thr Gln Ser Ser Gly Lys Asn Ile Gly Ser Arg Leu Tyr Leu Leu
115 120 125

cag gac gac acc act tat cag atc ttc aag ctg ctg ggt cag gag ttt 432
Gln Asp Asp Thr Thr Tyr Gln Ile Phe Lys Leu Leu Gly Gln Glu Phe
130 135 140

acc ttc gat gtc gac gtc tcc aat ctc cct tgc ggg ctg aac ggc gcc 480
Thr Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala
145 150 155 160

ctc tac ttt gtg gcc atg gac gcc gac ggc gga ttg tcc aaa tac cct 528
Leu Tyr Phe Val Ala Met Asp Ala Asp Gly Gly Leu Ser Lys Tyr Pro
165 170 175

ggc aac aag gca ggc gct aag tat ggc act ggt tac tgc gac tct cag 576
Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln
180 185 190

tgc cct cgg gat ctc aag ttc atc aac ggt cag gcc aac gtt gaa ggc 624
Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly
195 200 205

tgg cag ccg tct gcc aac gac cca aat gcc ggc gtt ggt aac cac ggt 672
Trp Gln Pro Ser Ala Asn Asp Pro Asn Ala Gly Val Gly Asn His Gly
210 215 220

tcc tgc tgc gct gag atg gat gtc tgg gaa gcc aac agc atc tct act 720
Ser Cys Cys Ala Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Thr
225 230 235 240

gcg gtg acg cct cac cca tgc gac acc ccc ggc cag acc atg tgc cag 768
Ala Val Thr Pro His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Gln
245 250 255

gga gac gac tgt ggt gga acc tac tcc tcc act cga tat gct ggt acc 816
Gly Asp Asp Cys Gly Gly Thr Tyr Ser Ser Thr Arg Tyr Ala Gly Thr
260 265 270

tgc gac cct gat ggc tgc gac ttc aat cct tac cgc cag ggc aac cac 864
Cys Asp Pro Asp Gly Cys Asp Phe Asn Pro Tyr Arg Gln Gly Asn His
275 280 285

tcg ttc tac ggc ccc ggg aag atc gtc gac act agc tcc aaa ttc acc 912
Ser Phe Tyr Gly Pro Gly Lys Ile Val Asp Thr Ser Ser Lys Phe Thr
290 295 300

gtc gtc acc cag ttc atc acc gac gac ggg acc ccc tcc ggc acc ctg 960
Val Val Thr Gln Phe Ile Thr Asp Asp Gly Thr Pro Ser Gly Thr Leu
305 310 315 320

acg gag atc aaa cgc ttc tac gtc cag aac ggc aag gtg atc ccc cag 1008
Thr Glu Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Gln
325 330 335

tcg gag tcg acg atc agc ggc gtc acc ggc aac tca atc acc acc gag 1056
Ser Glu Ser Thr Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu
340 345 350

tat tgc acg gcc cag aag gcc gcc ttc ggc gac aac acc ggc ttc ttc 1104
Tyr Cys Thr Ala Gln Lys Ala Ala Phe Gly Asp Asn Thr Gly Phe Phe
355 360 365

acg cac ggc ggg ctt cag aag atc agt cag gct ctg gct cag ggc atg 1152
Thr His Gly Gly Leu Gln Lys Ile Ser Gln Ala Leu Ala Gln Gly Met
370 375 380

gtc ctc gtc atg agc ctg tgg gac gat cac gcc gcc aac atg ctc tgg 1200
Val Leu Val Met Ser Leu Trp Asp Asp His Ala Ala Asn Met Leu Trp
385 390 395 400

ctg gac agc acc tac ccg act gat gcg gac ccg gac acc cct ggc gtc 1248
Leu Asp Ser Thr Tyr Pro Thr Asp Ala Asp Pro Asp Thr Pro Gly Val
405 410 415

gcg cgc ggt acc tgc ccc acg acc tcc ggc gtc ccg gcc gac gtt gag 1296
Ala Arg Gly Thr Cys Pro Thr Thr Ser Gly Val Pro Ala Asp Val Glu
420 425 430

tcg cag aac ccc aat tca tat gtt atc tac tcc aac atc aag gtc gga 1344
Ser Gln Asn Pro Asn Ser Tyr Val Ile Tyr Ser Asn Ile Lys Val Gly
435 440 445

ccc atc aac tcg acc ttc acc gcc aac taa 1374
Pro Ile Asn Ser Thr Phe Thr Ala Asn
450 455



<210> 8 

<211> 457 

<212> PRT 

<213> Thermoascus aurantiacus 



<400> 8 



Met Tyr Gln Arg Ala Leu Leu Phe Ser Phe Phe Leu Ser Ala Ala Arg
1 5 10 15

Ala Gln Gln Ala Gly Thr Leu Thr Ala Glu Asn His Pro Ser Leu Thr
20 25 30

Trp Gln Gln Cys Ser Ser Gly Gly Ser Cys Thr Thr Gln Asn Gly Lys
35 40 45

Val Val Ile Asp Ala Asn Trp Arg Trp Val His Thr Thr Ser Gly Tyr
50 55 60

Thr Asn Cys Tyr Thr Gly Asn Thr Trp Asp Thr Ser Ile Cys Pro Asp
65 70 75 80

Asp Val Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser
85 90 95

Gly Thr Tyr Gly Val Thr Thr Ser Gly Asn Ala Leu Arg Leu Asn Phe
100 105 110

Val Thr Gln Ser Ser Gly Lys Asn Ile Gly Ser Arg Leu Tyr Leu Leu
115 120 125

Gln Asp Asp Thr Thr Tyr Gln Ile Phe Lys Leu Leu Gly Gln Glu Phe
130 135 140

Thr Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala
145 150 155 160

Leu Tyr Phe Val Ala Met Asp Ala Asp Gly Gly Leu Ser Lys Tyr Pro
165 170 175

Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln
180 185 190

Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly
195 200 205

Trp Gln Pro Ser Ala Asn Asp Pro Asn Ala Gly Val Gly Asn His Gly
210 215 220

Ser Cys Cys Ala Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Thr
225 230 235 240

Ala Val Thr Pro His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Gln
245 250 255

Gly Asp Asp Cys Gly Gly Thr Tyr Ser Ser Thr Arg Tyr Ala Gly Thr
260 265 270

Cys Asp Pro Asp Gly Cys Asp Phe Asn Pro Tyr Arg Gln Gly Asn His
275 280 285

Ser Phe Tyr Gly Pro Gly Lys Ile Val Asp Thr Ser Ser Lys Phe Thr
290 295 300

Val Val Thr Gln Phe Ile Thr Asp Asp Gly Thr Pro Ser Gly Thr Leu
305 310 315 320

Thr Glu Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Gln
325 330 335

Ser Glu Ser Thr Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu
340 345 350

Tyr Cys Thr Ala Gln Lys Ala Ala Phe Gly Asp Asn Thr Gly Phe Phe
355 360 365

Thr His Gly Gly Leu Gln Lys Ile Ser Gln Ala Leu Ala Gln Gly Met
370 375 380

Val Leu Val Met Ser Leu Trp Asp Asp His Ala Ala Asn Met Leu Trp
385 390 395 400

Leu Asp Ser Thr Tyr Pro Thr Asp Ala Asp Pro Asp Thr Pro Gly Val
405 410 415

Ala Arg Gly Thr Cys Pro Thr Thr Ser Gly Val Pro Ala Asp Val Glu
420 425 430

Ser Gln Asn Pro Asn Ser Tyr Val Ile Tyr Ser Asn Ile Lys Val Gly
435 440 445

Pro Ile Asn Ser Thr Phe Thr Ala Asn
450 455



<210> 9 

<211> 1617 

<212> DNA 

<213> Thielavia australiensis 
<220> 

<221> CDS 

<222> (1)..(1617)

<223> 

<400> 9 

atg tat gcc aag ttc gcg acc ctc gcc gcc ctc gtg gct ggc gcc tcc 48
Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Gly Ala Ser
1 5 10 15

gcc cag gcc gtc tgc agc ctt acc gct gag acg cac cct tcc ctg acg 96
Ala Gln Ala Val Cys Ser Leu Thr Ala Glu Thr His Pro Ser Leu Thr
20 25 30

tgg cag aag tgc acg gcc ccc ggc agc tgc acc aac gtc gcc ggc tcc 144
Trp Gln Lys Cys Thr Ala Pro Gly Ser Cys Thr Asn Val Ala Gly Ser
35 40 45

atc acc atc gac gcc aac tgg cgc tgg act cac cag acc tcg tcc gcg 192
Ile Thr Ile Asp Ala Asn Trp Arg Trp Thr His Gln Thr Ser Ser Ala
50 55 60

acc aac tgc tac agc ggc agc aag tgg gac tcg tcc atc tgc acg acc 240
Thr Asn Cys Tyr Ser Gly Ser Lys Trp Asp Ser Ser Ile Cys Thr Thr
65 70 75 80

ggc acc gac tgc gcc tcc aag tgc tgc att gat ggc gcc gag tac tcg 288
Gly Thr Asp Cys Ala Ser Lys Cys Cys Ile Asp Gly Ala Glu Tyr Ser
85 90 95

agc acc tac ggc atc acc acc agc ggc aat gcc ctg aac ctc aag ttc 336
Ser Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ala Leu Asn Leu Lys Phe
100 105 110

gtc acc aag ggc cag tac tcg acc aac att ggc tcg cgt acc tac ctc 384
Val Thr Lys Gly Gln Tyr Ser Thr Asn Ile Gly Ser Arg Thr Tyr Leu
115 120 125

atg gag tcg gac acc aag tac cag atg ttc aag ctc ctt ggc aac gag 432
Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Lys Leu Leu Gly Asn Glu
130 135 140

ttc acc ttc gac gtc gat gtc tcc aac ctc ggc tgc ggc ctc aac ggc 480
Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly
145 150 155 160

gcc ctg tac ttc gtc tcc atg gat gcc gac ggt ggc atg tcc aag tac 528
Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys Tyr
165 170 175

tcg ggc aac aag gcc ggt gcc aag tac ggt acc ggc tac tgc gat gct 576
Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala
180 185 190

cag tgc ccc cgc gac ctc aag ttc atc aac ggc gag gcc aac gtt gag 624
Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Val Glu
195 200 205

ggc tgg gag agc tcg acc aac gac gcc aac gcc ggc tcg ggc aag tac 672
Gly Trp Glu Ser Ser Thr Asn Asp Ala Asn Ala Gly Ser Gly Lys Tyr
210 215 220

ggc agc tgc tgc acc gag atg gac gtc tgg gag gcc aac aac atg gcg 720
Gly Ser Cys Cys Thr Glu Met Asp Val Trp Glu Ala Asn Asn Met Ala
225 230 235 240

act gcc ttc act cct cac cct tgc acc acc att ggc cag act cgc tgc 768
Thr Ala Phe Thr Pro His Pro Cys Thr Thr Ile Gly Gln Thr Arg Cys
245 250 255

gag ggc gac acc tgc ggc ggc acc tac agc tca gac cgc tac gcc ggc 816
Glu Gly Asp Thr Cys Gly Gly Thr Tyr Ser Ser Asp Arg Tyr Ala Gly
260 265 270

gtc tgc gac ccc gac gga tgc gac ttc aac tcg tac cgc cag ggc aac 864
Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn
275 280 285

aag acc ttc tac ggc aag ggc atg acc gtc gac acc acc aag aag atc 912
Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys Ile
290 295 300

acg gtc gtc acc cag ttc ctc aag aac tcg gcc ggc gag ctc tcc gag 960
Thr Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu Leu Ser Glu
305 310 315 320

atc aag cgc ttc tac gcc cag gac ggc aag gtc atc ccg aac agt gag 1008
Ile Lys Arg Phe Tyr Ala Gln Asp Gly Lys Val Ile Pro Asn Ser Glu
325 330 335

tct acc att gcc ggc atc ccc ggc aac tcc atc acc aag gcc tac tgc 1056
Ser Thr Ile Ala Gly Ile Pro Gly Asn Ser Ile Thr Lys Ala Tyr Cys
340 345 350

gac gcc cag aag acc gtc ttc cag aac acc gac gac ttc acc gcc aag 1104
Asp Ala Gln Lys Thr Val Phe Gln Asn Thr Asp Asp Phe Thr Ala Lys
355 360 365

ggc ggc ctc gtc cag atg ggc aag gcc ctc gcc ggc gac atg gtc ctc 1152
Gly Gly Leu Val Gln Met Gly Lys Ala Leu Ala Gly Asp Met Val Leu
370 375 380

gtc atg tcc gtc tgg gac gac cac gcc gtc aac atg ctc tgg cta gac 1200
Val Met Ser Val Trp Asp Asp His Ala Val Asn Met Leu Trp Leu Asp
385 390 395 400

tcg acc tac ccg acc gac cag gtc ggc gtt gcc ggc gct gag cgc ggc 1248
Ser Thr Tyr Pro Thr Asp Gln Val Gly Val Ala Gly Ala Glu Arg Gly
405 410 415

gcc tgc ccc acc acc tcg ggc gtc ccc tcg gat gtt gag gcc aac gcc 1296
Ala Cys Pro Thr Thr Ser Gly Val Pro Ser Asp Val Glu Ala Asn Ala
420 425 430

ccc aac tcc aac gtc atc ttc tcc aac atc cgc ttc ggc ccc atc ggc 1344
Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile Gly
435 440 445

tcc acc gtc cag ggc ctg ccc agc tcc ggc ggc acc tcc agc agc tcg 1392
Ser Thr Val Gln Gly Leu Pro Ser Ser Gly Gly Thr Ser Ser Ser Ser
450 455 460

agc gcc gct ccc cag tcg acc agc acc aag gcc tcg acc acc acc tca 1440
Ser Ala Ala Pro Gln Ser Thr Ser Thr Lys Ala Ser Thr Thr Thr Ser
465 470 475 480

gct gtc cgc acc acc tcg act gcc acc acc aag acc acc tcc tcg gct 1488
Ala Val Arg Thr Thr Ser Thr Ala Thr Thr Lys Thr Thr Ser Ser Ala
485 490 495

ccc gcc cag ggc acc aac act gcc aag cat tgg cag caa tgc ggt ggt 1536
Pro Ala Gln Gly Thr Asn Thr Ala Lys His Trp Gln Gln Cys Gly Gly
500 505 510

aac ggc tgg acc ggc ccg acg gtg tgc gag tct ccc tac aag tgc acc 1584
Asn Gly Trp Thr Gly Pro Thr Val Cys Glu Ser Pro Tyr Lys Cys Thr
515 520 525

aag cag aac gac tgg tac tcg cag tgc ctc taa 1617
Lys Gln Asn Asp Trp Tyr Ser Gln Cys Leu
530 535



<210> 10 
<211> 538 
<212> PRT 

<213> Thielavia australiensis 
<400> 10 

Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Gly Ala Ser
1 5 10 15

Ala Gln Ala Val Cys Ser Leu Thr Ala Glu Thr His Pro Ser Leu Thr
20 25 30

Trp Gln Lys Cys Thr Ala Pro Gly Ser Cys Thr Asn Val Ala Gly Ser
35 40 45

Ile Thr Ile Asp Ala Asn Trp Arg Trp Thr His Gln Thr Ser Ser Ala
50 55 60

Thr Asn Cys Tyr Ser Gly Ser Lys Trp Asp Ser Ser Ile Cys Thr Thr
65 70 75 80

Gly Thr Asp Cys Ala Ser Lys Cys Cys Ile Asp Gly Ala Glu Tyr Ser
85 90 95

Ser Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ala Leu Asn Leu Lys Phe
100 105 110

Val Thr Lys Gly Gln Tyr Ser Thr Asn Ile Gly Ser Arg Thr Tyr Leu
115 120 125

Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Lys Leu Leu Gly Asn Glu
130 135 140

Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly
145 150 155 160

Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys Tyr
165 170 175

Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala
180 185 190

Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Val Glu
195 200 205

Gly Trp Glu Ser Ser Thr Asn Asp Ala Asn Ala Gly Ser Gly Lys Tyr
210 215 220

Gly Ser Cys Cys Thr Glu Met Asp Val Trp Glu Ala Asn Asn Met Ala
225 230 235 240

Thr Ala Phe Thr Pro His Pro Cys Thr Thr Ile Gly Gln Thr Arg Cys
245 250 255

Glu Gly Asp Thr Cys Gly Gly Thr Tyr Ser Ser Asp Arg Tyr Ala Gly
260 265 270

Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn
275 280 285

Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys Ile
290 295 300

Thr Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu Leu Ser Glu
305 310 315 320

Ile Lys Arg Phe Tyr Ala Gln Asp Gly Lys Val Ile Pro Asn Ser Glu
325 330 335

Ser Thr Ile Ala Gly Ile Pro Gly Asn Ser Ile Thr Lys Ala Tyr Cys
340 345 350

Asp Ala Gln Lys Thr Val Phe Gln Asn Thr Asp Asp Phe Thr Ala Lys
355 360 365

Gly Gly Leu Val Gln Met Gly Lys Ala Leu Ala Gly Asp Met Val Leu
370 375 380

Val Met Ser Val Trp Asp Asp His Ala Val Asn Met Leu Trp Leu Asp
385 390 395 400

Ser Thr Tyr Pro Thr Asp Gln Val Gly Val Ala Gly Ala Glu Arg Gly
405 410 415

Ala Cys Pro Thr Thr Ser Gly Val Pro Ser Asp Val Glu Ala Asn Ala
420 425 430

Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile Gly
435 440 445

Ser Thr Val Gln Gly Leu Pro Ser Ser Gly Gly Thr Ser Ser Ser Ser
450 455 460

Ser Ala Ala Pro Gln Ser Thr Ser Thr Lys Ala Ser Thr Thr Thr Ser
465 470 475 480

Ala Val Arg Thr Thr Ser Thr Ala Thr Thr Lys Thr Thr Ser Ser Ala
485 490 495

Pro Ala Gln Gly Thr Asn Thr Ala Lys His Trp Gln Gln Cys Gly Gly
500 505 510

Asn Gly Trp Thr Gly Pro Thr Val Cys Glu Ser Pro Tyr Lys Cys Thr
515 520 525

Lys Gln Asn Asp Trp Tyr Ser Gln Cys Leu
530 535



<210> 11 

<211> 1248 

<212> DNA 

<213> Verticillium tenerum 
<220> 

<221> CDS 

<222> (1)..(1248)

<223> 
<400> 11

atg aag aag gct ctc atc acc agc ctc tcc ctg ctg gcc acg gcc atg 48
Met Lys Lys Ala Leu Ile Thr Ser Leu Ser Leu Leu Ala Thr Ala Met
1 5 10 15

ggc cag cag gcc ggt acc ctc gag acc gag acg cat ccc aag ctg acc 96
Gly Gln Gln Ala Gly Thr Leu Glu Thr Glu Thr His Pro Lys Leu Thr
20 25 30

tgg cag cgc tgc acc acc tcc ggc tgt acc aac gtc aac ggc gag gtc 144
Trp Gln Arg Cys Thr Thr Ser Gly Cys Thr Asn Val Asn Gly Glu Val
35 40 45

gtc atc gac gcc aac tgg cgt tgg gcc cac gac atc aac ggc tac gag 192
Val Ile Asp Ala Asn Trp Arg Trp Ala His Asp Ile Asn Gly Tyr Glu
50 55 60

aac tgc ttc gag ggc aac acc tgg acc ggc acc tgc agc ggc gcc gac 240
Asn Cys Phe Glu Gly Asn Thr Trp Thr Gly Thr Cys Ser Gly Ala Asp
65 70 75 80

ggc tgc gcg aag aac tgc gcc gtc gag gga gcc aac tac cag tcg acc 288
Gly Cys Ala Lys Asn Cys Ala Val Glu Gly Ala Asn Tyr Gln Ser Thr
85 90 95

tac ggt gtc tcg acc agc ggc aac gcc ctc tcc ctg cgc ttc gtc acc 336
Tyr Gly Val Ser Thr Ser Gly Asn Ala Leu Ser Leu Arg Phe Val Thr
100 105 110

gag cac gag cac ggc gtc aac acc ggt tcg cgc acg tac ctc atg gag 384
Glu His Glu His Gly Val Asn Thr Gly Ser Arg Thr Tyr Leu Met Glu
115 120 125

agc gcc acc aag tac cag atg ttc acc ctg atg aac aac gag ctc gcc 432
Ser Ala Thr Lys Tyr Gln Met Phe Thr Leu Met Asn Asn Glu Leu Ala
130 135 140

ttc gac gtc gac ctg tcc aag gtc gcc tgc ggc atg aac agc gcc ctc 480
Phe Asp Val Asp Leu Ser Lys Val Ala Cys Gly Met Asn Ser Ala Leu
145 150 155 160

tac ctc gtc ccc atg aag gcc gac ggc ggt ctc tcg tcc gag acc aac 528
Tyr Leu Val Pro Met Lys Ala Asp Gly Gly Leu Ser Ser Glu Thr Asn
165 170 175

aac aac gcc ggc gcc aag tac ggt acc ggt tac tgc gac gcc cag tgc 576
Asn Asn Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala Gln Cys
180 185 190

gct cgc gat ctc aag ttc gtc aac ggc aag gcc aac atc gag ggc tgg 624
Ala Arg Asp Leu Lys Phe Val Asn Gly Lys Ala Asn Ile Glu Gly Trp
195 200 205

caa gcc tcc aag acc gac gag aac tct ggc gtc ggt aac atg ggc tcc 672
Gln Ala Ser Lys Thr Asp Glu Asn Ser Gly Val Gly Asn Met Gly Ser
210 215 220

tgc tgt gct gag att gac gtt tgg gag tcc aac cgc gag tct ttc gcc 720
Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Arg Glu Ser Phe Ala
225 230 235 240

ttc acc cct cac gct tgc tcg cag aac gag tac cac gtc tgc acc ggc 768
Phe Thr Pro His Ala Cys Ser Gln Asn Glu Tyr His Val Cys Thr Gly
245 250 255

gcc aac tgc ggc ggt acc tac tcg gac gac cgc ttc gcc ggc aag tgc 816
Ala Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg Phe Ala Gly Lys Cys
260 265 270

gat gcc aac ggt tgc gac tac aac ccc ttc cgc gtg ggc aac cag aac 864
Asp Ala Asn Gly Cys Asp Tyr Asn Pro Phe Arg Val Gly Asn Gln Asn
275 280 285

ttc tac ggc ccc ggc atg acc gtc aac acc aac tcc aag ttc act gtc 912
Phe Tyr Gly Pro Gly Met Thr Val Asn Thr Asn Ser Lys Phe Thr Val
290 295 300

atc tct cgc ttc cgg gag aac gag gcc tac cag gtc ttc atc cag aac 960
Ile Ser Arg Phe Arg Glu Asn Glu Ala Tyr Gln Val Phe Ile Gln Asn
305 310 315 320

ggc cgc acc atc gag gtc ccc cgt ccc acc ctc tcc ggc atc acc cag 1008
Gly Arg Thr Ile Glu Val Pro Arg Pro Thr Leu Ser Gly Ile Thr Gln
325 330 335

ttc gag gcc aag atc acc ccc gag ttc tgc tcg acc tac ccc acc gtc 1056
Phe Glu Ala Lys Ile Thr Pro Glu Phe Cys Ser Thr Tyr Pro Thr Val
340 345 350

ttc ggc gac cgc gac cgc cac ggc gag atc ggc ggc cac acc gcc ctc 1104
Phe Gly Asp Arg Asp Arg His Gly Glu Ile Gly Gly His Thr Ala Leu
355 360 365

aac gcg gcc ctc cgc atg ccc atg gtc ctc gtc atg tcc atc tgg gcc 1152
Asn Ala Ala Leu Arg Met Pro Met Val Leu Val Met Ser Ile Trp Ala
370 375 380

gac cac tac gcc aac atg ctc tgg ctc gac tcc atc tac ccg cca gag 1200
Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser Ile Tyr Pro Pro Glu
385 390 395 400

aag agg ggc cag ccc ggc gcc cac cgc ggc cgc aga tct aga ggg tga 1248
Lys Arg Gly Gln Pro Gly Ala His Arg Gly Arg Arg Ser Arg Gly
405 410 415



<210> 12 
<211> 415 
<212> PRT 

<213> Verticillium tenerum 
<400> 12 

Met Lys Lys Ala Leu Ile Thr Ser Leu Ser Leu Leu Ala Thr Ala Met
1 5 10 15

Gly Gln Gln Ala Gly Thr Leu Glu Thr Glu Thr His Pro Lys Leu Thr
20 25 30

Trp Gln Arg Cys Thr Thr Ser Gly Cys Thr Asn Val Asn Gly Glu Val
35 40 45

Val Ile Asp Ala Asn Trp Arg Trp Ala His Asp Ile Asn Gly Tyr Glu
50 55 60

Asn Cys Phe Glu Gly Asn Thr Trp Thr Gly Thr Cys Ser Gly Ala Asp
65 70 75 80

Gly Cys Ala Lys Asn Cys Ala Val Glu Gly Ala Asn Tyr Gln Ser Thr
85 90 95

Tyr Gly Val Ser Thr Ser Gly Asn Ala Leu Ser Leu Arg Phe Val Thr
100 105 110

Glu His Glu His Gly Val Asn Thr Gly Ser Arg Thr Tyr Leu Met Glu
115 120 125

Ser Ala Thr Lys Tyr Gln Met Phe Thr Leu Met Asn Asn Glu Leu Ala
130 135 140

Phe Asp Val Asp Leu Ser Lys Val Ala Cys Gly Met Asn Ser Ala Leu
145 150 155 160

Tyr Leu Val Pro Met Lys Ala Asp Gly Gly Leu Ser Ser Glu Thr Asn
165 170 175

Asn Asn Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala Gln Cys
180 185 190

Ala Arg Asp Leu Lys Phe Val Asn Gly Lys Ala Asn Ile Glu Gly Trp
195 200 205

Gln Ala Ser Lys Thr Asp Glu Asn Ser Gly Val Gly Asn Met Gly Ser
210 215 220

Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Arg Glu Ser Phe Ala
225 230 235 240

Phe Thr Pro His Ala Cys Ser Gln Asn Glu Tyr His Val Cys Thr Gly
245 250 255

Ala Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg Phe Ala Gly Lys Cys
260 265 270

Asp Ala Asn Gly Cys Asp Tyr Asn Pro Phe Arg Val Gly Asn Gln Asn
275 280 285

Phe Tyr Gly Pro Gly Met Thr Val Asn Thr Asn Ser Lys Phe Thr Val
290 295 300

Ile Ser Arg Phe Arg Glu Asn Glu Ala Tyr Gln Val Phe Ile Gln Asn
305 310 315 320

Gly Arg Thr Ile Glu Val Pro Arg Pro Thr Leu Ser Gly Ile Thr Gln
325 330 335

Phe Glu Ala Lys Ile Thr Pro Glu Phe Cys Ser Thr Tyr Pro Thr Val
340 345 350

Phe Gly Asp Arg Asp Arg His Gly Glu Ile Gly Gly His Thr Ala Leu
355 360 365

Asn Ala Ala Leu Arg Met Pro Met Val Leu Val Met Ser Ile Trp Ala
370 375 380

Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser Ile Tyr Pro Pro Glu
385 390 395 400

Lys Arg Gly Gln Pro Gly Ala His Arg Gly Arg Arg Ser Arg Gly
405 410 415



<210> 13 

<211> 1341 

<212> DNA 

<213> Neotermes castaneus 
<220> 

<221> CDS 

<222> (1) . . (1341) 

<223> 

<400> 13 

gca cga ggg ctc gct gct gca ttg ttc acc ttt gca tgt agc gtt ggt 48
Ala Arg Gly Leu Ala Ala Ala Leu Phe Thr Phe Ala Cys Ser Val Gly
1 5 10 15

atc ggc acc aaa acg gcc gag aac cac ccg aag ctg aac tgg cag aac 96
Ile Gly Thr Lys Thr Ala Glu Asn His Pro Lys Leu Asn Trp Gln Asn
20 25 30

tgc gcc tcc aag ggc agc tgc tca caa gtg tcc ggc gaa gtg aca atg 144
Cys Ala Ser Lys Gly Ser Cys Ser Gln Val Ser Gly Glu Val Thr Met
35 40 45

gac tcg aac tgg cgg tgg acc cac gat ggc aac ggc aag aac tgc tac 192
Asp Ser Asn Trp Arg Trp Thr His Asp Gly Asn Gly Lys Asn Cys Tyr
50 55 60

gac ggc aac acc tgg atc tcc agc ctc tgc cca gac ggc aag acc tgc 240
Asp Gly Asn Thr Trp Ile Ser Ser Leu Cys Pro Asp Gly Lys Thr Cys
65 70 75 80

tct gac aag tgc gtc ctc gat ggc gcc gaa tac caa gcg acc tac ggc 288
Ser Asp Lys Cys Val Leu Asp Gly Ala Glu Tyr Gln Ala Thr Tyr Gly
85 90 95

atc acc tcg aac ggg acc gcg gtc acc ctc aag ttc gtc acc cac ggc 336
Ile Thr Ser Asn Gly Thr Ala Val Thr Leu Lys Phe Val Thr His Gly
100 105 110

tcg tac tcg acg aac atc ggc tcc cgc ctg tat ctc ctc aag gac gaa 384
Ser Tyr Ser Thr Asn Ile Gly Ser Arg Leu Tyr Leu Leu Lys Asp Glu
115 120 125

aac act tac tac atc ttc aag gtg aac aac aag gaa ttc aca ttc agc 432
Asn Thr Tyr Tyr Ile Phe Lys Val Asn Asn Lys Glu Phe Thr Phe Ser
130 135 140

gtc gat gtg tcg aag ctc ccg tgc ggc ctg aac ggt gcc ctc tac ttc 480
Val Asp Val Ser Lys Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe
145 150 155 160

gtc tcg atg gac gcc gac ggt ggc gca gga aag tat tca ggt gcg aag 528
Val Ser Met Asp Ala Asp Gly Gly Ala Gly Lys Tyr Ser Gly Ala Lys
165 170 175

cca ggc gcg aag tac ggc ctc ggc tac tgc gat gcg caa tgc ccg agc 576
Pro Gly Ala Lys Tyr Gly Leu Gly Tyr Cys Asp Ala Gln Cys Pro Ser
180 185 190

gat ctg aag ttc atc aac ggc gaa gcg aac agc gat ggc tgg aag ccc 624
Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Ser Asp Gly Trp Lys Pro
195 200 205

cag gcg aac gac aag aat gcg gga aac ggc aaa tac gga tcg tgc tgc 672
Gln Ala Asn Asp Lys Asn Ala Gly Asn Gly Lys Tyr Gly Ser Cys Cys
210 215 220

tcg gaa atg gac gtt tgg gag gcg aac tcg cag gca aca gct tac act 720
Ser Glu Met Asp Val Trp Glu Ala Asn Ser Gln Ala Thr Ala Tyr Thr
225 230 235 240

ccg cac gtc tgc aag acc acg ggc cag cag cgc tgc tcg ggc aca tcg 768
Pro His Val Cys Lys Thr Thr Gly Gln Gln Arg Cys Ser Gly Thr Ser
245 250 255

gaa tgc ggc ggc cag gat ggc gca gcg cgt ttc cag gga ctg tgc gac 816
Glu Cys Gly Gly Gln Asp Gly Ala Ala Arg Phe Gln Gly Leu Cys Asp
260 265 270

gag gac ggt tgc gac ttc aac agc tgg cgc cag ggc gac aag acg ttc 864
Glu Asp Gly Cys Asp Phe Asn Ser Trp Arg Gln Gly Asp Lys Thr Phe
275 280 285

tac ggc ccg gga ttg act gtt gac acg aag tcg ccg ttc aca gtc gtc 912
Tyr Gly Pro Gly Leu Thr Val Asp Thr Lys Ser Pro Phe Thr Val Val
290 295 300

aca caa ttc gtc gga agt ccg gtg aag gaa atc cgc agg aag tac gtc 960
Thr Gln Phe Val Gly Ser Pro Val Lys Glu Ile Arg Arg Lys Tyr Val
305 310 315 320

cag aac gga aag gtg att gag aac tcg aag aac aag att tcg gga att 1008
Gln Asn Gly Lys Val Ile Glu Asn Ser Lys Asn Lys Ile Ser Gly Ile
325 330 335

gac gag acg aac gca gtg agt gat act ttc tgc gat cag caa aag aag 1056
Asp Glu Thr Asn Ala Val Ser Asp Thr Phe Cys Asp Gln Gln Lys Lys
340 345 350

gcc ttc ggt gat acg aac gat ttc aag aac aag ggc ggt ttc gct aag 1104
Ala Phe Gly Asp Thr Asn Asp Phe Lys Asn Lys Gly Gly Phe Ala Lys
355 360 365

ttg ggt cag gtg ttc gag act ggt cag gtt ctc gtg ctg tcg ctg tgg 1152
Leu Gly Gln Val Phe Glu Thr Gly Gln Val Leu Val Leu Ser Leu Trp
370 375 380

gat gac cac tcg gtt gca atg ctg tgg ttg gac tcg gcc tac cca acg 1200
Asp Asp His Ser Val Ala Met Leu Trp Leu Asp Ser Ala Tyr Pro Thr
385 390 395 400

aac aag gat aag agc agc cca ggt gtt gac cgt ggg cct tgc ccg acg 1248
Asn Lys Asp Lys Ser Ser Pro Gly Val Asp Arg Gly Pro Cys Pro Thr
405 410 415

act tcc ggg aag ccg gat gat gtt gaa agc caa tct ccc gat gca acc 1296
Thr Ser Gly Lys Pro Asp Asp Val Glu Ser Gln Ser Pro Asp Ala Thr
420 425 430

gtc att tat ggc aac atc aag ttc ggt gca ctg gac tcc act tac 1341
Val Ile Tyr Gly Asn Ile Lys Phe Gly Ala Leu Asp Ser Thr Tyr
435 440 445



<210> 14 

<211> 447 

<212> PRT 

<213> Neotermes castaneus 

<400> 14 

Ala Arg Gly Leu Ala Ala Ala Leu Phe Thr Phe Ala Cys Ser Val Gly
1 5 10 15

Ile Gly Thr Lys Thr Ala Glu Asn His Pro Lys Leu Asn Trp Gln Asn
20 25 30

Cys Ala Ser Lys Gly Ser Cys Ser Gln Val Ser Gly Glu Val Thr Met
35 40 45

Asp Ser Asn Trp Arg Trp Thr His Asp Gly Asn Gly Lys Asn Cys Tyr
50 55 60

Asp Gly Asn Thr Trp Ile Ser Ser Leu Cys Pro Asp Gly Lys Thr Cys
65 70 75 80

Ser Asp Lys Cys Val Leu Asp Gly Ala Glu Tyr Gln Ala Thr Tyr Gly
85 90 95

Ile Thr Ser Asn Gly Thr Ala Val Thr Leu Lys Phe Val Thr His Gly
100 105 110

Ser Tyr Ser Thr Asn Ile Gly Ser Arg Leu Tyr Leu Leu Lys Asp Glu
115 120 125

Asn Thr Tyr Tyr Ile Phe Lys Val Asn Asn Lys Glu Phe Thr Phe Ser
130 135 140

Val Asp Val Ser Lys Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe
145 150 155 160

Val Ser Met Asp Ala Asp Gly Gly Ala Gly Lys Tyr Ser Gly Ala Lys
165 170 175

Pro Gly Ala Lys Tyr Gly Leu Gly Tyr Cys Asp Ala Gln Cys Pro Ser
180 185 190

Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Ser Asp Gly Trp Lys Pro
195 200 205

Gln Ala Asn Asp Lys Asn Ala Gly Asn Gly Lys Tyr Gly Ser Cys Cys
210 215 220

Ser Glu Met Asp Val Trp Glu Ala Asn Ser Gln Ala Thr Ala Tyr Thr
225 230 235 240

Pro His Val Cys Lys Thr Thr Gly Gln Gln Arg Cys Ser Gly Thr Ser
245 250 255

Glu Cys Gly Gly Gln Asp Gly Ala Ala Arg Phe Gln Gly Leu Cys Asp
260 265 270

Glu Asp Gly Cys Asp Phe Asn Ser Trp Arg Gln Gly Asp Lys Thr Phe
275 280 285

Tyr Gly Pro Gly Leu Thr Val Asp Thr Lys Ser Pro Phe Thr Val Val
290 295 300

Thr Gln Phe Val Gly Ser Pro Val Lys Glu Ile Arg Arg Lys Tyr Val
305 310 315 320

Gln Asn Gly Lys Val Ile Glu Asn Ser Lys Asn Lys Ile Ser Gly Ile
325 330 335

Asp Glu Thr Asn Ala Val Ser Asp Thr Phe Cys Asp Gln Gln Lys Lys
340 345 350

Ala Phe Gly Asp Thr Asn Asp Phe Lys Asn Lys Gly Gly Phe Ala Lys
355 360 365

Leu Gly Gln Val Phe Glu Thr Gly Gln Val Leu Val Leu Ser Leu Trp
370 375 380

Asp Asp His Ser Val Ala Met Leu Trp Leu Asp Ser Ala Tyr Pro Thr
385 390 395 400

Asn Lys Asp Lys Ser Ser Pro Gly Val Asp Arg Gly Pro Cys Pro Thr
405 410 415

Thr Ser Gly Lys Pro Asp Asp Val Glu Ser Gln Ser Pro Asp Ala Thr
420 425 430

Val Ile Tyr Gly Asn Ile Lys Phe Gly Ala Leu Asp Ser Thr Tyr
435 440 445



<210> 15 

<211> 1359 

<212> DNA 

<213> Melanocarpus albomyces
<220> 

<221> CDS 

<222> (1) . . (1359) 

<223> 

<400> 15 

atg atg atg aag cag tac ctc cag tac ctc gcg gcc gcg ctg ccg ctc 48
Met Met Met Lys Gln Tyr Leu Gln Tyr Leu Ala Ala Ala Leu Pro Leu
1 5 10 15

gtc ggc ctc gcc gcc ggc cag cgc gct ggt aac gag acg ccc gag agc 96
Val Gly Leu Ala Ala Gly Gln Arg Ala Gly Asn Glu Thr Pro Glu Ser
20 25 30

cac ccc ccg ctc acc tgg cag agg tgc acg gcc ccg ggc aac tgc cag 144
His Pro Pro Leu Thr Trp Gln Arg Cys Thr Ala Pro Gly Asn Cys Gln
35 40 45

acc gtg aac gcc gag gtc gta att gac gcc aac tgg cgc tgg ctg cac 192
Thr Val Asn Ala Glu Val Val Ile Asp Ala Asn Trp Arg Trp Leu His
50 55 60

gac gac aac atg cag aac tgc tac gac ggc aac cag tgg acc aac gcc 240
Asp Asp Asn Met Gln Asn Cys Tyr Asp Gly Asn Gln Trp Thr Asn Ala
65 70 75 80

tgc agc acc gcc acc gac tgc gct gag aag tgc atg atc gag ggt gcc 288
Cys Ser Thr Ala Thr Asp Cys Ala Glu Lys Cys Met Ile Glu Gly Ala
85 90 95

ggc gac tac ctg ggc acc tac ggc gcc tcg acc agc ggc gac gcc ctg 336
Gly Asp Tyr Leu Gly Thr Tyr Gly Ala Ser Thr Ser Gly Asp Ala Leu
100 105 110

acg ctc aag ttc gtc acg aag cac gag tac ggc acc aac gtc ggc tcg 384
Thr Leu Lys Phe Val Thr Lys His Glu Tyr Gly Thr Asn Val Gly Ser
115 120 125

cgc ttc tac ctc atg aac ggc ccg gac aag tac cag atg ttc gac ctc 432
Arg Phe Tyr Leu Met Asn Gly Pro Asp Lys Tyr Gln Met Phe Asp Leu
130 135 140

ctg ggc aac gag ctt gcc ttt gac gtc gac ctc tcg acc gtc gag tgc 480
Leu Gly Asn Glu Leu Ala Phe Asp Val Asp Leu Ser Thr Val Glu Cys
145 150 155 160

ggc atc aac agc gcc ctg tac ttc gtc gcc atg gag gag gac ggc ggc 528
Gly Ile Asn Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly
165 170 175

atg gcc agc tac ccg agc aac cag gcc ggc gcc cgg tac ggc act ggg 576
Met Ala Ser Tyr Pro Ser Asn Gln Ala Gly Ala Arg Tyr Gly Thr Gly
180 185 190

tac tgc gat gcc caa tgc gct cgt gac ctc aag ttc gtt ggc ggc aag 624
Tyr Cys Asp Ala Gln Cys Ala Arg Asp Leu Lys Phe Val Gly Gly Lys
195 200 205

gcc aac att gag ggc tgg aag ccg tcc acc aac gac ccc aac gct ggc 672
Ala Asn Ile Glu Gly Trp Lys Pro Ser Thr Asn Asp Pro Asn Ala Gly
210 215 220

gtc ggc ccg tac ggc ggc tgc tgc gct gag atc gac gtc tgg gag tcg 720
Val Gly Pro Tyr Gly Gly Cys Cys Ala Glu Ile Asp Val Trp Glu Ser
225 230 235 240

aac gcc tat gcc ttc gct ttc acg ccg cac gcg tgc acg acc aac gag 768
Asn Ala Tyr Ala Phe Ala Phe Thr Pro His Ala Cys Thr Thr Asn Glu
245 250 255

tac cac gtc tgc gag acc acc aac tgc ggt ggc acc tac tcg gag gac 816
Tyr His Val Cys Glu Thr Thr Asn Cys Gly Gly Thr Tyr Ser Glu Asp
260 265 270

cgc ttc gcc ggc aag tgc gac gcc aac ggc tgc gac tac aac ccc tac 864
Arg Phe Ala Gly Lys Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr
275 280 285

cgc atg ggc aac ccc gac ttc tac ggc aag ggc aag acg ctc gac acc 912
Arg Met Gly Asn Pro Asp Phe Tyr Gly Lys Gly Lys Thr Leu Asp Thr
290 295 300

agc cgc aag ttc acc gtc gtc tcc cgc ttc gag gag aac aag ctc tcc 960
Ser Arg Lys Phe Thr Val Val Ser Arg Phe Glu Glu Asn Lys Leu Ser
305 310 315 320

cag tac ttc atc cag gac ggc cgc aag atc gag atc ccg ccg ccg acg 1008
Gln Tyr Phe Ile Gln Asp Gly Arg Lys Ile Glu Ile Pro Pro Pro Thr
325 330 335

tgg gag ggc atg ccc aac agc agc gag atc acc ccc gag ctc tgc tcc 1056
Trp Glu Gly Met Pro Asn Ser Ser Glu Ile Thr Pro Glu Leu Cys Ser
340 345 350

acc atg ttc gat gtg ttc aac gac cgc aac cgc ttc gag gag gtc ggc 1104
Thr Met Phe Asp Val Phe Asn Asp Arg Asn Arg Phe Glu Glu Val Gly
355 360 365

ggc ttc gag cag ctg aac aac gcc ctc cgg gtt ccc atg gtc ctc gtc 1152
Gly Phe Glu Gln Leu Asn Asn Ala Leu Arg Val Pro Met Val Leu Val
370 375 380

atg tcc atc tgg gac gac cac tac gcc aac atg ctc tgg ctc gac tcc 1200
Met Ser Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser
385 390 395 400

atc tac ccg ccc gag aag gag ggc cag ccc ggc gcc gcc cgt ggc gac 1248
Ile Tyr Pro Pro Glu Lys Glu Gly Gln Pro Gly Ala Ala Arg Gly Asp
405 410 415

tgc ccc acg gac tcg ggt gtc ccc gcc gag gtc gag gct cag ttc ccc 1296
Cys Pro Thr Asp Ser Gly Val Pro Ala Glu Val Glu Ala Gln Phe Pro
420 425 430

gac gcc cag gtc gtc tgg tcc aac atc cgc ttc ggc ccc atc ggc tcg 1344
Asp Ala Gln Val Val Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser
435 440 445

acc tac gac ttc taa 1359
Thr Tyr Asp Phe
450



<210> 16 

<211> 452 

<212> PRT 

<213> Melanocarpus albomyces 

<400> 16 



Met Met Met Lys Gln Tyr Leu Gln Tyr Leu Ala Ala Ala Leu Pro Leu
1 5 10 15

Val Gly Leu Ala Ala Gly Gln Arg Ala Gly Asn Glu Thr Pro Glu Ser
20 25 30

His Pro Pro Leu Thr Trp Gln Arg Cys Thr Ala Pro Gly Asn Cys Gln
35 40 45

Thr Val Asn Ala Glu Val Val Ile Asp Ala Asn Trp Arg Trp Leu His
50 55 60

Asp Asp Asn Met Gln Asn Cys Tyr Asp Gly Asn Gln Trp Thr Asn Ala
65 70 75 80

Cys Ser Thr Ala Thr Asp Cys Ala Glu Lys Cys Met Ile Glu Gly Ala
85 90 95

Gly Asp Tyr Leu Gly Thr Tyr Gly Ala Ser Thr Ser Gly Asp Ala Leu
100 105 110

Thr Leu Lys Phe Val Thr Lys His Glu Tyr Gly Thr Asn Val Gly Ser
115 120 125

Arg Phe Tyr Leu Met Asn Gly Pro Asp Lys Tyr Gln Met Phe Asp Leu
130 135 140

Leu Gly Asn Glu Leu Ala Phe Asp Val Asp Leu Ser Thr Val Glu Cys
145 150 155 160

Gly Ile Asn Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly
165 170 175

Met Ala Ser Tyr Pro Ser Asn Gln Ala Gly Ala Arg Tyr Gly Thr Gly
180 185 190

Tyr Cys Asp Ala Gln Cys Ala Arg Asp Leu Lys Phe Val Gly Gly Lys
195 200 205

Ala Asn Ile Glu Gly Trp Lys Pro Ser Thr Asn Asp Pro Asn Ala Gly
210 215 220

Val Gly Pro Tyr Gly Gly Cys Cys Ala Glu Ile Asp Val Trp Glu Ser
225 230 235 240

Asn Ala Tyr Ala Phe Ala Phe Thr Pro His Ala Cys Thr Thr Asn Glu
245 250 255

Tyr His Val Cys Glu Thr Thr Asn Cys Gly Gly Thr Tyr Ser Glu Asp
260 265 270

Arg Phe Ala Gly Lys Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr
275 280 285

Arg Met Gly Asn Pro Asp Phe Tyr Gly Lys Gly Lys Thr Leu Asp Thr
290 295 300

Ser Arg Lys Phe Thr Val Val Ser Arg Phe Glu Glu Asn Lys Leu Ser
305 310 315 320

Gln Tyr Phe Ile Gln Asp Gly Arg Lys Ile Glu Ile Pro Pro Pro Thr
325 330 335

Trp Glu Gly Met Pro Asn Ser Ser Glu Ile Thr Pro Glu Leu Cys Ser
340 345 350

Thr Met Phe Asp Val Phe Asn Asp Arg Asn Arg Phe Glu Glu Val Gly
355 360 365

Gly Phe Glu Gln Leu Asn Asn Ala Leu Arg Val Pro Met Val Leu Val
370 375 380

Met Ser Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser
385 390 395 400

Ile Tyr Pro Pro Glu Lys Glu Gly Gln Pro Gly Ala Ala Arg Gly Asp
405 410 415

Cys Pro Thr Asp Ser Gly Val Pro Ala Glu Val Glu Ala Gln Phe Pro
420 425 430

Asp Ala Gln Val Val Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser
435 440 445

Thr Tyr Asp Phe
450



<210> 17 
<211> 221 
<212> DNA 

<213> Trichothecium roseum 
<220> 

<221> misc_feature
<222> (1) . . (221)

<223> Partial CBH1 encoding sequence
<400> 17

tacgcccagt gcgcccgtga cctcaagttc ctcggcggca cttccaacta cgacggctgg 60
aagccctcgg acactgacga cagcgccggt gtcggcaacc gcggatcctg ctgcgccgag 120
attgacatct gggagtccaa ctcgcacgcc ttcgccttca ccccccacgc ctgcgagaac 180
aacgagtacc acatctgcga gaccaccgac tgcggcggca c 221



<210> 18
<211> 239
<212> DNA

<213> Humicola nigrescens
<220>

<221> misc_feature
<222> (1) . . (239)

<223> Partial CBH1 encoding sequence
<400> 18

tacggcacgg ggtactgcga cgcccaatgc gcccgcgatc tcaagttcgt tggcggcaag 60
gccaatgttg agggctggaa acagtccacc aacgatgcca atgccggcgt gggtccgatg 120
ggcggttgct gcgccgaaat tgacgtctgg gaatcgaacg cccatgcctt cgccttcacg 180
ccgcacgcgt gcgagaacaa caagtaccac atctgcgaga ctgacggatg cggcggcac 239


<210> 19
<211> 199
<212> DNA

<213> Cladorrhinum foecundissimum
<220>

<221> misc_feature 
<222> (1) . . (199) 

<223> Partial CBH1 encoding sequence 
<400> 19 

tacataaacg gtatcggcaa cgttgagggt tggtcctcct ctaccaacga tcccaacgct 60 
ggtgtcggta accrcggtac ttgctgctcc gagaatggat atctgggagg ccaacaagat 120 
ctcgaccgcc tacactcccc acccctgcac caccatcgac cagcacatgt gcgagggcaa 180 
ctcgtgcggc ggaacctac 199 



<210> 20
<211> 191
<212> DNA

<213> Diplodia gossypina
<220>

<221> misc_feature
<222> (1) . . (191)

<223> Partial CBH1 encoding
<400> 20

gttgatccga cggcaaggcc caacgtcgag ggctgggtcc cgtccgagaa cgactccaac 60
gctggtgtcg gcaaccttgg ctcttgctgt gctgagatgg atatctggga ggccaactcc 120
atctcgaccg cctacacccc ccacagctgc aagacggtcg cccagcactc ttgcactggc 180
gacgactgcg g 191

<210> 21 
<211> 232 
<212> DNA 

<213> Myceliophthora thermophila 
<220> 

<221> misc_feature 
<222> (1) . . (232) 

<223> Partial CBH1 encoding sequence 
<400> 21 

gggtactgcg acgcccaatg cgcacgcgac ctcaagttcg tcggcggcaa gggcaacatc 60
gagggctgga agccgtccac caacgatgcc aatgccggtg tcggtcctta tggcgggtgc 120
tgcgctgaga tcgacgtctg ggagtcgaac aagtatgctt tcgctttcac cccgcacggt 180 
tgcgagaacc ctaaatacca cgtctgcgag accaccaact gcggcggcac ct 232 



<210> 22 
<211> 467 
<212> DNA 




<213> Rhizomucor pusillus 

<220> 

<221> misc_feature
<222> (1) . . (467) 

<223> Partial CBH1 encoding sequence 
<400> 22 

tccttcgcct ttacccccca cgcttgctcg cagnaacgag taccacgtct gcaccaccaa 60 
caactgcggc ggcacctact cggacgaccg cttcgccggc aagtgcgacg ccaacggttg 120 
cgactacaac ccgttccgcc tgggcaacca ggacttctac ggcccgggca tgaccgtcga 180 
caccaactcc aagttcaccg tcatctcccg cttcagggag aacgaggcct accaggtctt 240 
catgcagggc ggccggacca tcgaggtccc ggccccgcag ctgtccgggc tcacccagtt 300 
cgacgccaag atcacccccg agttctgcga cacctacccg accgtcttcg acgaccgcaa 360 
ccgccacggc gagatcggcg gccacaccgc cctcaacgcc gccctgcgca tgcccatggt 420 
cctcgtcatg tccatctggg ctgaccacta cgccagctgc tagtgtc 467 

<210> 23 
<211> 534 
<212> DNA 

<213> Meripilus giganteus 
<220> 

<221> misc_feature
<222> (1) . . (534) 

<223> Partial CBH1 encoding sequence 
<400> 23 

gggagggctc cccgaacgac ccgaacgcgg gaagcggcca gtacggaacg tgctgcaacg 60 
agatggacat ctgggaggcg aaccagaacg gcgcggcggt cacgccgcac gtctgctccg 120 
tcgacggcca gacgcgctgc gagggcacgg actgcggcga cggcgacgag cggtacgacg 18 0 
gcatctgcga caaggacggc tgcgacttca actcgtaccg catgggcgac cagtccttcc 240 
tcggcctcgg caagaccgtc gacacctcga agaagttcac cgtcgtcacc cagttcctca 300 
ccgcggacaa cacgacgtcc ggccagctca cggagatccg ccggctgtac gtgcaggacg 360 
gcaaggtcat cgcgaactcg aagacgaaca tccccggcct cgactcgttc gactccatca 420 
ccgacgactt ctgcaacgcg cagaaggagg tcttcggcga caccaactcg ttcgagaagc 480 
tcggcggcct cgcggagatg ggcaaggcct tccagaaggg catggtcctc gtca 534 

<210> 24 

<211> 563 

<212> DNA 

<213> Exidia glandulosa 
<220>

<221> misc_feature
<222> (1) . . (563)

<223> Partial CBH1 encoding sequence 
<400> 24 

gccacgtcga gggctggact ccttcmccaa cgatgccaac gccggcattg gcacccacgg 60 

ctcctgctgt tcggagatgg acatctggga ggctaacaat gttgccgctg cgtacacccc 120 

ccatccttgc acaactatcg gccagtcgat ctgctcgggc gattcttgcg gaggaaccta 180 

cagctctgac cgttacgccg gtgtctgcga tccagacggt tgcgatttca acagctaccg 240 

catgggcgac acgggcttct acggcaaggg cctgacagtc gacacgagct ccaagttcac 300 

cgtcgtcacc cagttcctca ccggctccga cggcaacctt tccgagatca agcgcttcta 360 

cgtccagaac ggcaaggtca ttcccaactc gcagtccaag attgccggcg tcagcggcaa 420 

ctccatcacc accgacttct gctccgccca gaagaccgcc ttcggcgaca ccaacgtctt 480 

cgcgcaaaag ggaggtactc gccgggatgg gcgccgccct caaggccggc atggtcctcg 540

tcatgtccat ctgggacgac cac 563 

<210> 25 
<211> 218 
<212> DNA 

<213> Xylaria hypoxylon 
<220> 

<221> misc_feature
<222> (1) . . (218) 

<223> Partial CBH1 encoding sequence 
<400> 25 

gacgctcagt gtgcccgtga cttgaagttc gtcggtggca agggcaacgt tgagggatgg 60 
gagccatcca ccaacgacga caacgccggt gttggccctt acggwgcctg ctgtgccgaa 120
atsgatgtst gggagtccaa ctstcactct ttcgctttca cccctcaccc wtgcaccacc 180 
aacgaatacc acgtctgtga gcaggacgag tgtggcgg 218 

<210> 26 

<211> 492 

<212> DNA 

<213> Acremonium sp.
<220> 

<221> misc_feature 

<222> (1) . . (492) 

<223> Partial CBH1 encoding sequence 

<400> 26 
yygacggggu "CLyCyaCyC ccaatgcgcc cgtgatctca agttcgtcgg cggcaaggcc 60
o.o.c^ d l cga.gg gctggaggcc gtccaccaac gacgcgaacg ccggcgtcgg cccgatgggc 120
yy*- ^y^-tycy cyycicictL.Lyci t.y uc ugggag tccaacgccc aeget t ttgc cttcacgccg 180
/*• /-* ri f*r i- v_ctt-.yc.vjL.ycy cly ct ci C cl a. C ct cX ecctccacacc tgcgagacct ccaactgcgg cggtacctac 240
LL.LyaL.ydUL yc c u eye egg cc tc cgegae gccaacggcc gcgactacaa cccgtaccgc 300
acgggcaacc ccgacttcta cggcaagggc aagactcttg acacctcgcg gaagttcacc 360
gtcgtcaccc gctttcagga gaacgacctc tcgcagtact tcgtccagga cggcccgaag 420
atcgagatcc cgcccccgac ctgggacggc ctcccgaaga gcagcacata cgccgagctg 480
tgcgcgaccc ag 492



<210> 27 

<211> 481 

<212> DNA 

<213> Acremonium sp.
<220> 

<221> misc_feature 

<222> (1) . . (481) 

<223> Partial CBH1 encoding sequence 

<400> 27 



ggctccgttt actcctaccc ttgcacggaa atcggccaga gccgctgcga gggcgacagc 60
tgcggcggta cctacagcac cgaccgctac gctggcgtct gcgaccccga tggatgcgac 120
ttcaactcgt accgccaggg caacaagacc ttctatggca agggcatgac cgtcgacacc 180
accaagaaga ttaccgtcgt cacccagttc ctcaccgact cgtccggcaa cctgtccgag 240
atcaagcgct tctacgccca gaacggcgtc gtcatcccca actccgagtc caccattgct 300
ggcgtccctg gcaactcgat cacccaggac tactgcgaca agcagaagac cgcctttggt 360
gacaacaacg acttcgacaa gaagggtggt ctcgcccaga tgggtaaggc cctggcccaa 420
cccatggtcc tcgtcatgtc cgtctgggat gaccatgccg tcaacatgct ctgcttcgaa 480
a 481



<210> 28 

<211> 463 

<212> DNA 

<213> Chaetomium sp.
<220>

<221> misc_feature

<222> (1) . . (463)

<223> Partial CBH1 encoding sequence

<400> 28

ctccccgtct tcacgccgca cgcgtgcaag aacatcaagt accacgtctg cgagacgtcg 60

ggatgcggcg gcacctactc ggaggaccgc ttcgcgggcg actgcgacgc caacggttgc 120

gactacaacc cctaccgcat gggcaacacc gacttctacg gcaagggcat gacggtcgac 180 

accagcaaga agttcaccgt cgtgacccaa ttccaggaga acaagctcac ccagttcttc 240 

gtccagaacg gcaagaagat cgagatccct ggccccaagt gggacggcat tgagggcgac 300 

agcgccgcca tcacgcccca gctgtgcact tccatgttca aggccttcga cgaccgcgat 360 

cgcttctcgg aggtcggcgg cttcacccag atcaaccagg ccctctcggt gcccatggtg 420 

ctcgtcatgt ccatctggga cgaccactac gccaacatgc ttg 463 

<210> 29 

<211> 513 

<212> DNA 

<213> Chaetomidium pingtungium 
<220> 

<221> misc_feature

<222> (1) . . (513) 

<223> Partial CBH1 encoding sequence 

<400> 29 



gaagggtggc agccctcctc caacgatgcc aatgcgggta ccggcaacca cgggtcctgc 60
tgcgcggaga tggatatctg ggaggccaac agcatctcca cggccttcac cccccatccg 120
tgcgacacgc ccggccaggt gatgtgcacc ggtgatgcct gcggtggcac ctacagctcc 180
gaccgctacg gcggcacctg cgaccccgac ggatgtgatt tcaactcctt ccgccagggc 240
aacaagacct tctacggccc tggcatgacc gtcgacacca agagcaagtt taccgtcgtc 300
acccagttca tcaccgacga cggcacctcc agcggcaccc tcaaggagat caagcgcttc 360
tacgtgcaga acggcaaggt gatccccaac tcggagtcga cctggaccgg cgtcagcggc 420
aactccatca ccaccgagta ctgcaccgcc cagaagagcc tgttccagga ccagaacgtc 480
ttcgaaaagc acggtggcct cgagggcatg ggt 513



<210> 30 

<211> 579 

<212> DNA 

<213> Myceliophthora thermophila 
<220> 

<221> misc_feature 

<222> (1) . . (579) 

<223> Partial CBH1 encoding sequence 

<400> 30 

gagatggata tttgggaggc caacaacatg gccgccgcct tcactcccca cccttgcacc 60 






gtgatcggcc agtcgcgctg cgagggcgac tcgtgcggcg gtacctacag caccgaccgc 120
tatgccggca tctgcgaccc cgacggatgc gacttcaact cgtaccgcca gggcaacaag 180
accttctacg gcaagggcat gacggtcgac acgaccaaga agatcacggt cgtcacccag 240
ttcctcaaga actcggccgg cgagctctcc gagatcaagc ggttctacgt ccagaacggc 300
aaggtcatcc ccaactccga gtccaccatc ccgggcgtcg agggcaactc cattacccag 360
gactggtgcg accgccagaa ggccgctttc ggcgacgtga ccgactttca crcracaacrqac 420
ggcatggtcc agatgggcaa ggccctcgcg ggcccaatgg tcctcgtcat gtccatctgg 480
gacgaccacg ccgtcaacat gctctggctc gaaatcacta gtgcggccgc tgcaggtcga 540
ccatatggga gagctccacg cgttggatgc atagcttga 579



<210> 31 

<211> 514 

<212> DNA 

<213> Myceliophthora hinnulea 
<220> 

<221> misc_feature

<222> (1) . . (514) 

<223> Partial CBH1 encoding sequence 

<400> 31 



cgtgagggct gggagagctc gaccaacgat gccaacgccg gcacgggcag gtacggcagc 60
tgctgctccg agatggacgt ctgggaggcc aacaacatgg ccaccgcctt caccccccat 120
ccttgcacca tcatcggcca gtcgcgctgc gagggcgaga cgtgcggcgg cacctacagc 180
tcggaccgct acgccggcgt ctgcgacccc gacggctgcg acttcaactc gtaccgccag 240
ggcaacaaga ccttctacgg caagggcatg acggtcgaca cgaccaagaa gctcacggtc 300
gtcacgcagt tcctcaagaa ctcggccggc gagctgtccg agatcaagcg gttctacgtc 360
caggacggca aggtgatccc caactccgag tccaccatcc ccggcgtcga gggcaactcg 420
atcacgcagg actggtgcga ccgccagaag gccgccttcg gcgacgtcac cgacttccag 480
gacaagggcg gcatggtcca gatggcaagg cgct 514



<210> 32 

<211> 477 

<212> DNA 

<213> Sporotrichum pruinosum 
<220> 

<221> misc_feature

<222> (1) . . (477) 

<223> Partial CBH1 encoding sequence 

<400> 32 
cacccttgcc gcaccacgaa cgacggtggc taccaacgct gccaaggacg tgactgcaac 60
cagcctcgtt atgagggtct ttgcgatcct gacggttgcg actacaaccc tttccgtatg 120
ggtaaccgcg aattctacgg ccctggaaag accgtcgaca ccaacaggaa gttcactgtt 180
gtgacccaat tcattaccga caacaactct gacactggta ccctcgtcga catccgccgc 240
ctctacgtcc aagacggccg tgtcattgcc aaccctccca ccaacttccc cggtctcatg 300
cccgcccacg actccatcac ttagcaattc tgtgacgacg ccaagcgagc attcgaggac 360
aacgacagct ttggcaggaa cggtggtctt gctcacatgg gtcgctccct tgccaagggc 420
catgtcctcg ccctttccat ttggaatgat cacactgcca acatgctctg gctcgaa 477



<210> 33 
<211> 500 
<212> DNA 

<213> Thielavia cf. microspora 
<220> 

<221> misc_feature
<222> (1) . . (500) 

<223> Partial CBH1 encoding sequence 
<400> 33 

gagatagatg tctgggagtc caactcgcac tcgtttgcct tcacgccgca cgcgtgcaag 60
aacaacaagt accacgtctg ccagacgacc gggtgcggcg gcacctactc ggaggaccgc 120
ttcgccggcg actgcgacgc caacggctgc gactacaacc cctaccgcat gggcaacacc 180
gacttttacg gcaagggcaa gacggtcgac acgagcaaga agtttaccat ggtgacccag 240
ttccaaaaga acaagctcgt ccagttcttt gtccaggacg gcaagaagat cgacatcccc 300
ggccccaagt gggacggcct gccgcagggc agcgccgcca tcaccccgga gctgtgcacc 360
ttcatgttca aggccttcaa cgaccgcgac cgcttctcag aggttggcgg cttcgaccag 420
atcaacacgg ccctctcggt gccaatggtg ctcgtcatgt ccatctggga tgatcactac 480
gccaacatgc tctggcttga 500

<210> 34 

<211> 470 

<212> DNA 

<213> Scytalidium sp.
<220>

<221> misc_feature

<222> (1) . . (470)

<223> Partial CBH1 encoding sequence

<400> 34

cgttnggccc gcgtcgcatg ctcccgcccg catggcccgc gggatttcca gccagagcat 60
gttggagtgg tggtcatccc agatggacat gacaaggacc atgggaatgg tgagggcctc 120
gttcagagca tcgaagccac cggtctcggc gaagcggttg cggtcatcga agacgcggaa 180
ctgagcatcg cagagctcag gggtgatgtc ggcgctgttc gggaggccgg gccaggtcgg 240
agggggcacc tcgatcttgc ggccgtcctg gacgaagaac tgagagagcc tgttacgctc 300
gaagcgggag acaacggtga acttgcggtt ggtgtcgacg gtcttgccct tgccatagaa 360
gtccttgttg cccatgcggt aggggttgta gtcgcagccg ttggcatcgc agtagccggc 420
gaagcggtca tccgagtagg taccaccgca gttgttggtc tccagatgtg 470

<210> 35 
<211> 491 
<212> DNA 

<213> Scytalidium sp. 
<220> 

<221> misc_feature 
<222> (1) . . (491) 

<223> Partial CBH1 encoding sequence 
<400> 35 

gaaatcgacg tctgggagtc gaacgcctat gcctatgcct taccccgcac gcttgcggca 60 
gccagaaccg ctaccacgtc tgcgagacca acaactgcgg tggtacctac tcggatgacc 120 
gcttcgccgg ttactgcgat gccaacggct gcgactacaa cccgtaccgc atgggcaaca 180 
gggacttcta cggcaagggc ctgcaggtcg acaccagccg gaagttcacc gtcgtgagcc 240
gcttcgagcg caacaagctc acccagttct tcgttcagga cggccgcaag atcgagcccc 300
ctgcgccgac ctgggacggc atcccgaaga gcgccgacat cacccccgag ttctgcagcg 360
cccagttcaa ggtcttcgac gaccgtgacc gcttcgcgga gactggcggc ttcgatgccc 420
tgaacgatgc tctcagcatt cccatggtcc ttgtcatgtc catctgggat taccactact 480 
ccaacataat c 491 

<210> 36 
<211> 221 
<212> DNA 

<213> Trichophaea saccata 
<220> 

<221> misc_feature
<222> (1) . . (221) 

<223> Partial CBH1 encoding sequence 
<400> 36 

tgcgactccc agtgtccccg cgatctcaag ttcatcaatg gacagggcaa cgttgaaggc 60 
tggaagccat cctcaaatga tgccaacgca ggcgtcgggg gacacggttc ctgctgcgca 120
gagatggatg tttgggaggc caattccatc tccgcggccg taacaccgca ctcgtgctcc 180
acaaccagcc agacgatgtg caacggcgac tcctgcggcg g 221 

<210> 37 

<211> 1365 

<212> DNA 

<213> Diplodia gossypina 
<220> 

<221> CDS 

<222> (1) . . (1365) 

<223> 

<400> 37 

atg ctt acc cag gca gtt ctc gct act ctc gcc acc ctg gcc gcc agc 48
Met Leu Thr Gln Ala Val Leu Ala Thr Leu Ala Thr Leu Ala Ala Ser
1 5 10 15

cag cag gtc ggc acc cag aag gag gag gtc cac ccc tcc atg acc tgg 96
Gln Gln Val Gly Thr Gln Lys Glu Glu Val His Pro Ser Met Thr Trp
20 25 30

cag act tgc acc agc agc ggc tgc acc acc aac cag ggc tcc atc gtc 144
Gln Thr Cys Thr Ser Ser Gly Cys Thr Thr Asn Gln Gly Ser Ile Val
35 40 45

gtt gac gcc aac tgg cgc tgg gtc cac aac acc gag ggc tac acc aac 192
Val Asp Ala Asn Trp Arg Trp Val His Asn Thr Glu Gly Tyr Thr Asn
50 55 60

tgc tac acg ggc aac acc tgg aac gcc gac tac tgc acc gac aac acc 240
Cys Tyr Thr Gly Asn Thr Trp Asn Ala Asp Tyr Cys Thr Asp Asn Thr
65 70 75 80

gag tgc gcc tcc aac tgc gcc ctc gac ggc gcc gac tac tct ggc acc 288
Glu Cys Ala Ser Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser Gly Thr
85 90 95

tac ggc gct acc acc tcc ggc gac tcg ctg cgc ctg aac ttc atc acc 336
Tyr Gly Ala Thr Thr Ser Gly Asp Ser Leu Arg Leu Asn Phe Ile Thr
100 105 110

aac ggc cag cag aag aac att ggc tcc cgc atg tac ctc atg cag gat 384
Asn Gly Gln Gln Lys Asn Ile Gly Ser Arg Met Tyr Leu Met Gln Asp
115 120 125

gac gag acc tac gcc gtc cac aag ctc ctc aac aag gag ttc acc ttc 432
Asp Glu Thr Tyr Ala Val His Lys Leu Leu Asn Lys Glu Phe Thr Phe
130 135 140

gac gtc gac acc tcc aag ctg cct tgc ggc ctc aac ggt gcc gtc tac 480
Asp Val Asp Thr Ser Lys Leu Pro Cys Gly Leu Asn Gly Ala Val Tyr
145 150 155 160

ttc gtc tcc atg gac gct gac ggt ggc atg gcc aag ttc ccc gac aac 528
Phe Val Ser Met Asp Ala Asp Gly Gly Met Ala Lys Phe Pro Asp Asn
165 170 175

aag gcc ggc gcc aag tac ggt acc ggt tac tgc gac tcg cag tgc ccc 576
Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro
180 185 190

cgt gac ctc aag ttc atc gac ggc aag gcc aac gtc gag ggc tgg gtc 624
Arg Asp Leu Lys Phe Ile Asp Gly Lys Ala Asn Val Glu Gly Trp Val
195 200 205

ccg tcc gag aac gac tcc aac gct ggt gtc ggc aac ctt ggc tct tgc 672
Pro Ser Glu Asn Asp Ser Asn Ala Gly Val Gly Asn Leu Gly Ser Cys
210 215 220

tgt gct gag atg gat atc tgg gag gcc aac tcc atc tcg acc gcc tac 720
Cys Ala Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Thr Ala Tyr
225 230 235 240

acc ccc cac agc tgc aag acg gtc gcc cag cac tct tgc act ggc gac 768
Thr Pro His Ser Cys Lys Thr Val Ala Gln His Ser Cys Thr Gly Asp
245 250 255

gac tgc ggt ggc acc tac tcc gcg acc cgc tac gcc ggc gac tgc gac 816
Asp Cys Gly Gly Thr Tyr Ser Ala Thr Arg Tyr Ala Gly Asp Cys Asp
260 265 270

ccc gac gga tgc gac ttc aac tcg tac cgc cag ggc gtc aag gac ttc 864
Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Val Lys Asp Phe
275 280 285

tac ggg ccc ggc atg acc gtc gac agc aac tcg gtc gtc acc gtc gtc 912
Tyr Gly Pro Gly Met Thr Val Asp Ser Asn Ser Val Val Thr Val Val
290 295 300

acg cag ttc atc acc aac gac ggc acc gcg tcc ggc acc ctc tcc gag 960
Thr Gln Phe Ile Thr Asn Asp Gly Thr Ala Ser Gly Thr Leu Ser Glu
305 310 315 320

atc aag cgc ttc tac gtc cag aac ggc aag gtt atc ccc aac tcc gag 1008
Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Glu
325 330 335

tcc acc atc gcc ggc gtc agc ggc aac agc atc acc tcc gcg tac tgc 1056
Ser Thr Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Ser Ala Tyr Cys
340 345 350

gac gcg cag aag gag gtc ttc ggc gac aac acg tcg ttc cag gac cag 1104
Asp Ala Gln Lys Glu Val Phe Gly Asp Asn Thr Ser Phe Gln Asp Gln
355 360 365

ggc ggc ttg gcc agc atg agc cag gcc ctc aac gcc ggc atg gtc ctc 1152
Gly Gly Leu Ala Ser Met Ser Gln Ala Leu Asn Ala Gly Met Val Leu
370 375 380

gtc atg tcc atc tgg gac gac cac cac agc aac atg ctc tgg ctc gac 1200
Val Met Ser Ile Trp Asp Asp His His Ser Asn Met Leu Trp Leu Asp
385 390 395 400

tcc gac tac ccc gtc gac gcc gac ccg agc cag ccc ggc atc tcc cgc 1248
Ser Asp Tyr Pro Val Asp Ala Asp Pro Ser Gln Pro Gly Ile Ser Arg
405 410 415

ggt act tgc ccc acc acc tct ggt gtc ccc agc gag gtt gag gag agc 1296
Gly Thr Cys Pro Thr Thr Ser Gly Val Pro Ser Glu Val Glu Glu Ser
420 425 430

gcc gct agc gcc tac gtc gtc tac tcg aac att aag gtt ggt gac ctt 1344
Ala Ala Ser Ala Tyr Val Val Tyr Ser Asn Ile Lys Val Gly Asp Leu
435 440 445

aac agc act ttc tct gct tag 1365
Asn Ser Thr Phe Ser Ala
450



<210> 38 

<211> 454 

<212> PRT 

<213> Diplodia gossypina 

<400> 38 

Met Leu Thr Gln Ala Val Leu Ala Thr Leu Ala Thr Leu Ala Ala Ser
1 5 10 15

Gln Gln Val Gly Thr Gln Lys Glu Glu Val His Pro Ser Met Thr Trp
20 25 30

Gln Thr Cys Thr Ser Ser Gly Cys Thr Thr Asn Gln Gly Ser Ile Val
35 40 45

Val Asp Ala Asn Trp Arg Trp Val His Asn Thr Glu Gly Tyr Thr Asn
50 55 60

Cys Tyr Thr Gly Asn Thr Trp Asn Ala Asp Tyr Cys Thr Asp Asn Thr
65 70 75 80

Glu Cys Ala Ser Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser Gly Thr
85 90 95

Tyr Gly Ala Thr Thr Ser Gly Asp Ser Leu Arg Leu Asn Phe Ile Thr
100 105 110

Asn Gly Gln Gln Lys Asn Ile Gly Ser Arg Met Tyr Leu Met Gln Asp
115 120 125

Asp Glu Thr Tyr Ala Val His Lys Leu Leu Asn Lys Glu Phe Thr Phe
130 135 140

Asp Val Asp Thr Ser Lys Leu Pro Cys Gly Leu Asn Gly Ala Val Tyr
145 150 155 160

Phe Val Ser Met Asp Ala Asp Gly Gly Met Ala Lys Phe Pro Asp Asn
165 170 175

Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro
180 185 190

Arg Asp Leu Lys Phe Ile Asp Gly Lys Ala Asn Val Glu Gly Trp Val
195 200 205

Pro Ser Glu Asn Asp Ser Asn Ala Gly Val Gly Asn Leu Gly Ser Cys
210 215 220

Cys Ala Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Thr Ala Tyr
225 230 235 240

Thr Pro His Ser Cys Lys Thr Val Ala Gln His Ser Cys Thr Gly Asp
245 250 255

Asp Cys Gly Gly Thr Tyr Ser Ala Thr Arg Tyr Ala Gly Asp Cys Asp
260 265 270

Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Val Lys Asp Phe
275 280 285

Tyr Gly Pro Gly Met Thr Val Asp Ser Asn Ser Val Val Thr Val Val
290 295 300

Thr Gln Phe Ile Thr Asn Asp Gly Thr Ala Ser Gly Thr Leu Ser Glu
305 310 315 320

Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Glu
325 330 335

Ser Thr Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Ser Ala Tyr Cys
340 345 350

Asp Ala Gln Lys Glu Val Phe Gly Asp Asn Thr Ser Phe Gln Asp Gln
355 360 365

Gly Gly Leu Ala Ser Met Ser Gln Ala Leu Asn Ala Gly Met Val Leu
370 375 380

Val Met Ser Ile Trp Asp Asp His His Ser Asn Met Leu Trp Leu Asp
385 390 395 400

Ser Asp Tyr Pro Val Asp Ala Asp Pro Ser Gln Pro Gly Ile Ser Arg
405 410 415

Gly Thr Cys Pro Thr Thr Ser Gly Val Pro Ser Glu Val Glu Glu Ser
420 425 430

Ala Ala Ser Ala Tyr Val Val Tyr Ser Asn Ile Lys Val Gly Asp Leu
435 440 445

Asn Ser Thr Phe Ser Ala
450



<210> 39 

<211> 1377 

<212> DNA 

<213> Trichophaea saccata 
<220> 

<221> CDS 

<222> (1)..(1377)

<223> 

<400> 39 

atg caa cgc ctt ctc gtt ctt ctc acc tcc ctt ctc gct ttc acc tat 48
Met Gln Arg Leu Leu Val Leu Leu Thr Ser Leu Leu Ala Phe Thr Tyr
1 5 10 15

ggc caa caa gtt ggc act caa cag gcc gaa gtc cac ccc tcg atg acc 96
Gly Gln Gln Val Gly Thr Gln Gln Ala Glu Val His Pro Ser Met Thr
20 25 30

tgg cag cag tgt aca aag tcc ggc ggc tgc acc acg aag aac ggc aaa 144
Trp Gln Gln Cys Thr Lys Ser Gly Gly Cys Thr Thr Lys Asn Gly Lys
35 40 45

gtc gtg atc gat gcc aac tgg cgt tgg gta cac aat gtc ggc ggc tac 192
Val Val Ile Asp Ala Asn Trp Arg Trp Val His Asn Val Gly Gly Tyr
50 55 60

acc aat tgc tac act ggc aac acc tgg gac agt tcg ctt tgt ccc gac 240
Thr Asn Cys Tyr Thr Gly Asn Thr Trp Asp Ser Ser Leu Cys Pro Asp
65 70 75 80

gat gtc acc tgc gcg aag aat tgc gct ctt gat ggc gcg gac tac tct 288
Asp Val Thr Cys Ala Lys Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser
85 90 95

ggc act tat gga gtt act gcg ggc ggg aat tcg ttg aag ctc acc ttc 336
Gly Thr Tyr Gly Val Thr Ala Gly Gly Asn Ser Leu Lys Leu Thr Phe
100 105 110

gtc act aag ggt caa tac tct act aat gtg ggc tcg cga ttg tat atg 384
Val Thr Lys Gly Gln Tyr Ser Thr Asn Val Gly Ser Arg Leu Tyr Met
115 120 125

ctc gcc gac gac agc aca tac cag atg tat aat ctg ctg aac cag gag 432
Leu Ala Asp Asp Ser Thr Tyr Gln Met Tyr Asn Leu Leu Asn Gln Glu
130 135 140

ttt acg ttc gac gtt gat gtt tct aat ctt cct tgt ggg ctt aac ggg 480
Phe Thr Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly
145 150 155 160

gct ctg tat ttc gtc tcg atg gat aag gat ggt ggg atg tcg aag tac 528
Ala Leu Tyr Phe Val Ser Met Asp Lys Asp Gly Gly Met Ser Lys Tyr
165 170 175

tct ggg aac aag gct ggt gcc aag tat gga act ggg tac tgc gac tcc 576
Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
180 185 190

cag tgt ccc cgc gat ctc aag ttc atc aat gga cag ggc aac gtt gaa 624
Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Gly Asn Val Glu
195 200 205

ggc tgg aag cca tcc tca aat gat gcc aac gca ggc gtc ggg gga cac 672
Gly Trp Lys Pro Ser Ser Asn Asp Ala Asn Ala Gly Val Gly Gly His
210 215 220

ggt tcc tgc tgc gca gag atg gat gtt tgg gag gcc aat tcc atc tcc 720
Gly Ser Cys Cys Ala Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser
225 230 235 240

gcg gcc gta aca ccg cac tcg tgc tcc aca acc agc cag acg atg tgc 768
Ala Ala Val Thr Pro His Ser Cys Ser Thr Thr Ser Gln Thr Met Cys
245 250 255

aac ggc gac tcc tgc ggc ggt acc tac tca gcc aca cga tac gct ggt 816
Asn Gly Asp Ser Cys Gly Gly Thr Tyr Ser Ala Thr Arg Tyr Ala Gly
260 265 270

gtc tgc gat ccc gat ggc tgc gac ttc aac tcc tac cgt atg ggc gac 864
Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asp
275 280 285

acg acc ttc tac ggc aag gga aag acg gtc gat acc agc tcc aag ttc 912
Thr Thr Phe Tyr Gly Lys Gly Lys Thr Val Asp Thr Ser Ser Lys Phe
290 295 300

acg gtc gtg acc cag ttc atc acc gac act gga acc gcc tcc ggc tcg 960
Thr Val Val Thr Gln Phe Ile Thr Asp Thr Gly Thr Ala Ser Gly Ser
305 310 315 320

ctc acg gag atc cgc cgc ttc tac gtc cag aac gga aag ttg atc ccc 1008
Leu Thr Glu Ile Arg Arg Phe Tyr Val Gln Asn Gly Lys Leu Ile Pro
325 330 335

aac tcc cag tcg aag atc tcg ggc gtc act ggc aac tcc atc acc tct 1056
Asn Ser Gln Ser Lys Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Ser
340 345 350

gct ttc tgc gac gct cag aag gcg gct ttc ggc gat aac tac acg ttc 1104
Ala Phe Cys Asp Ala Gln Lys Ala Ala Phe Gly Asp Asn Tyr Thr Phe
355 360 365

aag gac aag ggc ggc ttc gca tcc atg act act gct atg aag aac gga 1152
Lys Asp Lys Gly Gly Phe Ala Ser Met Thr Thr Ala Met Lys Asn Gly
370 375 380

atg gtc ctg gtt atg agt ctt tgg gat gac cac tac gcc aat atg ctc 1200
Met Val Leu Val Met Ser Leu Trp Asp Asp His Tyr Ala Asn Met Leu
385 390 395 400

tgg ctt gat agc gac tat ccc act aac gcg gac tcc tcc aag ccg ggt 1248
Trp Leu Asp Ser Asp Tyr Pro Thr Asn Ala Asp Ser Ser Lys Pro Gly
405 410 415

gtt gct cgt ggc acc tgc ccg act tct tcc ggc gtg ccc tcg gat gtc 1296
Val Ala Arg Gly Thr Cys Pro Thr Ser Ser Gly Val Pro Ser Asp Val
420 425 430

gag act aac aat gca agc gct tcg gtc acg tac tcc aac att aga ttt 1344
Glu Thr Asn Asn Ala Ser Ala Ser Val Thr Tyr Ser Asn Ile Arg Phe
435 440 445

gga gat ctc aat tcc act tac acc gcc cag taa 1377
Gly Asp Leu Asn Ser Thr Tyr Thr Ala Gln
450 455



<210> 40 

<211> 458 

<212> PRT 

<213> Trichophaea saccata 

<400> 40 

Met Gln Arg Leu Leu Val Leu Leu Thr Ser Leu Leu Ala Phe Thr Tyr
1 5 10 15

Gly Gln Gln Val Gly Thr Gln Gln Ala Glu Val His Pro Ser Met Thr
20 25 30

Trp Gln Gln Cys Thr Lys Ser Gly Gly Cys Thr Thr Lys Asn Gly Lys
35 40 45

Val Val Ile Asp Ala Asn Trp Arg Trp Val His Asn Val Gly Gly Tyr
50 55 60

Thr Asn Cys Tyr Thr Gly Asn Thr Trp Asp Ser Ser Leu Cys Pro Asp
65 70 75 80

Asp Val Thr Cys Ala Lys Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser
85 90 95

Gly Thr Tyr Gly Val Thr Ala Gly Gly Asn Ser Leu Lys Leu Thr Phe
100 105 110

Val Thr Lys Gly Gln Tyr Ser Thr Asn Val Gly Ser Arg Leu Tyr Met
115 120 125

Leu Ala Asp Asp Ser Thr Tyr Gln Met Tyr Asn Leu Leu Asn Gln Glu
130 135 140

Phe Thr Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly
145 150 155 160

Ala Leu Tyr Phe Val Ser Met Asp Lys Asp Gly Gly Met Ser Lys Tyr
165 170 175

Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
180 185 190

Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Gly Asn Val Glu
195 200 205

Gly Trp Lys Pro Ser Ser Asn Asp Ala Asn Ala Gly Val Gly Gly His
210 215 220

Gly Ser Cys Cys Ala Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser
225 230 235 240

Ala Ala Val Thr Pro His Ser Cys Ser Thr Thr Ser Gln Thr Met Cys
245 250 255

Asn Gly Asp Ser Cys Gly Gly Thr Tyr Ser Ala Thr Arg Tyr Ala Gly
260 265 270

Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asp
275 280 285

Thr Thr Phe Tyr Gly Lys Gly Lys Thr Val Asp Thr Ser Ser Lys Phe
290 295 300

Thr Val Val Thr Gln Phe Ile Thr Asp Thr Gly Thr Ala Ser Gly Ser
305 310 315 320

Leu Thr Glu Ile Arg Arg Phe Tyr Val Gln Asn Gly Lys Leu Ile Pro
325 330 335

Asn Ser Gln Ser Lys Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Ser
340 345 350

Ala Phe Cys Asp Ala Gln Lys Ala Ala Phe Gly Asp Asn Tyr Thr Phe
355 360 365

Lys Asp Lys Gly Gly Phe Ala Ser Met Thr Thr Ala Met Lys Asn Gly
370 375 380

Met Val Leu Val Met Ser Leu Trp Asp Asp His Tyr Ala Asn Met Leu
385 390 395 400

Trp Leu Asp Ser Asp Tyr Pro Thr Asn Ala Asp Ser Ser Lys Pro Gly
405 410 415

Val Ala Arg Gly Thr Cys Pro Thr Ser Ser Gly Val Pro Ser Asp Val
420 425 430

Glu Thr Asn Asn Ala Ser Ala Ser Val Thr Tyr Ser Asn Ile Arg Phe
435 440 445

Gly Asp Leu Asn Ser Thr Tyr Thr Ala Gln
450 455



<210> 41 

<211> 1353 

<212> DNA 

<213> Myceliophthora thermophila 
<220> 

<221> CDS 

<222> (1)..(1353)

<223> 

<400> 41 

atg aag cag tac ctc cag tac ctc gcg gcg acc ctg ccc ctg gtg ggc 48
Met Lys Gln Tyr Leu Gln Tyr Leu Ala Ala Thr Leu Pro Leu Val Gly
1 5 10 15

ctg gcc acg gcc cag cag gcg ggt aac ctg cag acc gag act cac ccc 96
Leu Ala Thr Ala Gln Gln Ala Gly Asn Leu Gln Thr Glu Thr His Pro
20 25 30

agg ctc act tgg tcc aag tgc acg gcc ccg gga tcc tgc caa cag gtc 144
Arg Leu Thr Trp Ser Lys Cys Thr Ala Pro Gly Ser Cys Gln Gln Val
35 40 45

aac ggc gag gtc gtc atc gac tcc aac tgg cgc tgg gtg cac gac gag 192
Asn Gly Glu Val Val Ile Asp Ser Asn Trp Arg Trp Val His Asp Glu
50 55 60

aac gcg cag aac tgc tac gac ggc aac cag tgg acc aac gct tgc agc 240
Asn Ala Gln Asn Cys Tyr Asp Gly Asn Gln Trp Thr Asn Ala Cys Ser
65 70 75 80

tct gcc acc gac tgc gcc gag aat tgc gcg ctc gag ggt gcc gac tac 288
Ser Ala Thr Asp Cys Ala Glu Asn Cys Ala Leu Glu Gly Ala Asp Tyr
85 90 95

cag ggc acc tat ggc gcc tcg acc agc ggc aat gcc ctg acg ctc acc 336
Gln Gly Thr Tyr Gly Ala Ser Thr Ser Gly Asn Ala Leu Thr Leu Thr
100 105 110

ttc gtc act aag cac gag tac ggc acc aac att ggc tcg cgc ctc tac 384
Phe Val Thr Lys His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Leu Tyr
115 120 125

ctc atg aac ggc gcg aac aag tac cag atg ttc acc ctc aag ggc aac 432
Leu Met Asn Gly Ala Asn Lys Tyr Gln Met Phe Thr Leu Lys Gly Asn
130 135 140

gag ctg gcc ttc gac gtc gac ctc tcg gcc gtc gag tgc ggc ctc aac 480
Glu Leu Ala Phe Asp Val Asp Leu Ser Ala Val Glu Cys Gly Leu Asn
145 150 155 160

agc gcc ctc tac ttc gtg gcc atg gag gag gat ggc ggt gtg tcg agc 528
Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Val Ser Ser
165 170 175

tac ccg acc aac acg gcc ggt gct aag ttc ggc act ggg tac tgc gac 576
Tyr Pro Thr Asn Thr Ala Gly Ala Lys Phe Gly Thr Gly Tyr Cys Asp
180 185 190

gcc caa tgc gca cgc gac ctc aag ttc gtc ggc ggc aag ggc aac atc 624
Ala Gln Cys Ala Arg Asp Leu Lys Phe Val Gly Gly Lys Gly Asn Ile
195 200 205

gag ggc tgg aag ccg tcc acc aac gat gcc aat gcc ggt gtc ggt cct 672
Glu Gly Trp Lys Pro Ser Thr Asn Asp Ala Asn Ala Gly Val Gly Pro
210 215 220

tat ggc ggg tgc tgc gct gag atc gac gtc tgg gag tcg aac aag tat 720
Tyr Gly Gly Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Lys Tyr
225 230 235 240

gct ttc gct ttc acc ccg cac ggt tgc gag aac cct aaa tac cac gtc 768
Ala Phe Ala Phe Thr Pro His Gly Cys Glu Asn Pro Lys Tyr His Val
245 250 255

tgc gag acc acc aac tgc ggt ggc acc tac tcc gag gac cgc ttc gct 816
Cys Glu Thr Thr Asn Cys Gly Gly Thr Tyr Ser Glu Asp Arg Phe Ala
260 265 270

ggt gac tgc gat gcc aac ggc tgc gac tac aac ccc tac cgc atg ggc 864
Gly Asp Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met Gly
275 280 285

aac cag gac ttc tac ggt ccc ggc ttg acg gtc gat acc agc aag aag 912
Asn Gln Asp Phe Tyr Gly Pro Gly Leu Thr Val Asp Thr Ser Lys Lys
290 295 300

ttc acc gtc gtc agc cag ttc gag gag aac aag ctc acc cag ttc ttc 960
Phe Thr Val Val Ser Gln Phe Glu Glu Asn Lys Leu Thr Gln Phe Phe
305 310 315 320

gtc cag gac ggc aag aag att gag atc ccc ggc ccc aag gtc gag ggc 1008
Val Gln Asp Gly Lys Lys Ile Glu Ile Pro Gly Pro Lys Val Glu Gly
325 330 335

atc gat gcg gac agc gcc gct atc acc cct gag ctg tgc agt gcc ctg 1056
Ile Asp Ala Asp Ser Ala Ala Ile Thr Pro Glu Leu Cys Ser Ala Leu
340 345 350

ttc aag gcc ttc gat gac cgt gac cgc ttc tcg gag gtt ggc ggc ttc 1104
Phe Lys Ala Phe Asp Asp Arg Asp Arg Phe Ser Glu Val Gly Gly Phe
355 360 365

gat gcc atc aac acg gcc ctc agc act ccc atg gtc ctc gtc atg tcc 1152
Asp Ala Ile Asn Thr Ala Leu Ser Thr Pro Met Val Leu Val Met Ser
370 375 380

atc tgg gat gat cac tac gcc aat atg ctc tgg ctc gac tcg agc tac 1200
Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser Ser Tyr
385 390 395 400

ccc cct gag aag gct ggc cag cct ggc ggt gac cgt ggc ccg tgt cct 1248
Pro Pro Glu Lys Ala Gly Gln Pro Gly Gly Asp Arg Gly Pro Cys Pro
405 410 415

cag gac tct ggc gtc ccg gcc gac gtt gag gct cag tac cct aat gcc 1296
Gln Asp Ser Gly Val Pro Ala Asp Val Glu Ala Gln Tyr Pro Asn Ala
420 425 430

aag gtc atc tgg tcc aac atc cgc ttc ggc ccc atc ggc tcg act gtc 1344
Lys Val Ile Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr Val
435 440 445

aac gtc taa 1353
Asn Val
450



<210> 42 

<211> 450 

<212> PRT 

<213> Myceliophthora thermophila 



<400> 42 



Met Lys Gln Tyr Leu Gln Tyr Leu Ala Ala Thr Leu Pro Leu Val Gly
1 5 10 15

Leu Ala Thr Ala Gln Gln Ala Gly Asn Leu Gln Thr Glu Thr His Pro
20 25 30

Arg Leu Thr Trp Ser Lys Cys Thr Ala Pro Gly Ser Cys Gln Gln Val
35 40 45

Asn Gly Glu Val Val Ile Asp Ser Asn Trp Arg Trp Val His Asp Glu
50 55 60

Asn Ala Gln Asn Cys Tyr Asp Gly Asn Gln Trp Thr Asn Ala Cys Ser
65 70 75 80

Ser Ala Thr Asp Cys Ala Glu Asn Cys Ala Leu Glu Gly Ala Asp Tyr
85 90 95

Gln Gly Thr Tyr Gly Ala Ser Thr Ser Gly Asn Ala Leu Thr Leu Thr
100 105 110

Phe Val Thr Lys His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Leu Tyr
115 120 125

Leu Met Asn Gly Ala Asn Lys Tyr Gln Met Phe Thr Leu Lys Gly Asn
130 135 140

Glu Leu Ala Phe Asp Val Asp Leu Ser Ala Val Glu Cys Gly Leu Asn
145 150 155 160

Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Val Ser Ser
165 170 175

Tyr Pro Thr Asn Thr Ala Gly Ala Lys Phe Gly Thr Gly Tyr Cys Asp
180 185 190

Ala Gln Cys Ala Arg Asp Leu Lys Phe Val Gly Gly Lys Gly Asn Ile
195 200 205

Glu Gly Trp Lys Pro Ser Thr Asn Asp Ala Asn Ala Gly Val Gly Pro
210 215 220

Tyr Gly Gly Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Lys Tyr
225 230 235 240

Ala Phe Ala Phe Thr Pro His Gly Cys Glu Asn Pro Lys Tyr His Val
245 250 255

Cys Glu Thr Thr Asn Cys Gly Gly Thr Tyr Ser Glu Asp Arg Phe Ala
260 265 270

Gly Asp Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met Gly
275 280 285

Asn Gln Asp Phe Tyr Gly Pro Gly Leu Thr Val Asp Thr Ser Lys Lys
290 295 300

Phe Thr Val Val Ser Gln Phe Glu Glu Asn Lys Leu Thr Gln Phe Phe
305 310 315 320

Val Gln Asp Gly Lys Lys Ile Glu Ile Pro Gly Pro Lys Val Glu Gly
325 330 335

Ile Asp Ala Asp Ser Ala Ala Ile Thr Pro Glu Leu Cys Ser Ala Leu
340 345 350

Phe Lys Ala Phe Asp Asp Arg Asp Arg Phe Ser Glu Val Gly Gly Phe
355 360 365

Asp Ala Ile Asn Thr Ala Leu Ser Thr Pro Met Val Leu Val Met Ser
370 375 380

Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser Ser Tyr
385 390 395 400

Pro Pro Glu Lys Ala Gly Gln Pro Gly Gly Asp Arg Gly Pro Cys Pro
405 410 415

Gln Asp Ser Gly Val Pro Ala Asp Val Glu Ala Gln Tyr Pro Asn Ala
420 425 430

Lys Val Ile Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr Val
435 440 445

Asn Val
450



<210> 43

<211> 1341

<212> DNA

<213> Xylaria hypoxylon 

<220> 

<221> CDS 

<222> (1)..(1341)

<223> 

<400> 43 

atg ttg tcc ctc gcc gtg tcg gcc gcc ctt ctc ggg ctc gcg tct gcc 48
Met Leu Ser Leu Ala Val Ser Ala Ala Leu Leu Gly Leu Ala Ser Ala
1 5 10 15

cag cag gtt gga aag gag caa tct gag act cac cct aag ctg tct tgg 96
Gln Gln Val Gly Lys Glu Gln Ser Glu Thr His Pro Lys Leu Ser Trp
20 25 30

aag aag tgc acc agc ggt ggt tcc tgc acc cag acc aac gct gag gtg 144
Lys Lys Cys Thr Ser Gly Gly Ser Cys Thr Gln Thr Asn Ala Glu Val
35 40 45

acc atc gac tct aac tgg cga tgg ctt cac tct ctc gaa ggc act gag 192
Thr Ile Asp Ser Asn Trp Arg Trp Leu His Ser Leu Glu Gly Thr Glu
50 55 60

aac tgc tac gat ggt aac aag tgg acc tcg cag tgc agc act ggc gag 240
Asn Cys Tyr Asp Gly Asn Lys Trp Thr Ser Gln Cys Ser Thr Gly Glu
65 70 75 80

gac tgc gcc acc aag tgc gcc atc gag ggt gcc gac tac agc aag acc 288
Asp Cys Ala Thr Lys Cys Ala Ile Glu Gly Ala Asp Tyr Ser Lys Thr
85 90 95

tac ggt gcc tct act agc ggc gat gct ctt acc ctc aag ttc ctg acc 336
Tyr Gly Ala Ser Thr Ser Gly Asp Ala Leu Thr Leu Lys Phe Leu Thr
100 105 110

aag cac gag tac gga acc aac atc ggc tcc cga ttc tac ctt atg aat 384
Lys His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe Tyr Leu Met Asn
115 120 125

ggt gcc gac aag tac cag acc ttc gac ctc aag ggt aac gag ttc acc 432
Gly Ala Asp Lys Tyr Gln Thr Phe Asp Leu Lys Gly Asn Glu Phe Thr
130 135 140

ttc gat gtc gac ctg tcc acc gtc gac tgt ggt ctt aac gcc gct ctt 480
Phe Asp Val Asp Leu Ser Thr Val Asp Cys Gly Leu Asn Ala Ala Leu
145 150 155 160

tac ttc gtc gcc atg gag gaa gac ggt ggc atg gct agc tac ccc aac 528
Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Met Ala Ser Tyr Pro Asn
165 170 175

aac aag gcc ggt gcc aag tac ggt acc ggt tac tgt gac gct cag tgt 576
Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala Gln Cys
180 185 190

gcc cgt gac ttg aag ttc gtc ggt ggc aag ggc aac gtt gag gga tgg 624
Ala Arg Asp Leu Lys Phe Val Gly Gly Lys Gly Asn Val Glu Gly Trp
195 200 205

gag cca tcc acc aac gac gac aac gcc ggt gtt ggc cct tac ggt gcc 672
Glu Pro Ser Thr Asn Asp Asp Asn Ala Gly Val Gly Pro Tyr Gly Ala
210 215 220

tgc tgt gcc gaa atc gat gtc tgg gag tcc aac tct cac tct ttc gct 720
Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Ser His Ser Phe Ala
225 230 235 240

ttc acc cct cac cct tgc acc acc aac gaa tac cac gtc tgt gag cag 768
Phe Thr Pro His Pro Cys Thr Thr Asn Glu Tyr His Val Cys Glu Gln
245 250 255

gac gag tgt ggt ggt acc tac tct gag gac cga ttc gct ggc aag tgt 816
Asp Glu Cys Gly Gly Thr Tyr Ser Glu Asp Arg Phe Ala Gly Lys Cys
260 265 270

gat gcc aac ggt tgt gac tac aac cct tac cgc atg ggt aac acc gac 864
Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met Gly Asn Thr Asp
275 280 285

ttc tac ggc cag ggc aag acc gtc gac acc agc aag aaa ttc act gtt 912
Phe Tyr Gly Gln Gly Lys Thr Val Asp Thr Ser Lys Lys Phe Thr Val
290 295 300

gtc acc cag ttc gcc gaa aac aag ttg act cag ttc ttc gtc cag gac 960
Val Thr Gln Phe Ala Glu Asn Lys Leu Thr Gln Phe Phe Val Gln Asp
305 310 315 320

ggt aag aag att gag atc ccc ggt ccc aag att gac ggt ttc cct acc 1008
Gly Lys Lys Ile Glu Ile Pro Gly Pro Lys Ile Asp Gly Phe Pro Thr
325 330 335

gat agc gcc atc acc ccc gag tac tgc act gcc gaa ttc aac gtt cta 1056
Asp Ser Ala Ile Thr Pro Glu Tyr Cys Thr Ala Glu Phe Asn Val Leu
340 345 350

gga gac cgt gac cgc ttc agt gaa gtt ggt ggc ttc gac cag ctc aac 1104
Gly Asp Arg Asp Arg Phe Ser Glu Val Gly Gly Phe Asp Gln Leu Asn
355 360 365

aac gct ctt gac gta ccc atg gtc ctt gtc atg tcc atc tgg gac gac 1152
Asn Ala Leu Asp Val Pro Met Val Leu Val Met Ser Ile Trp Asp Asp
370 375 380

cac tac gcc aac atg ctt tgg ctc gac tcc agc tac ccc cct gag aag 1200
His Tyr Ala Asn Met Leu Trp Leu Asp Ser Ser Tyr Pro Pro Glu Lys
385 390 395 400

gct ggc cag ccc ggt ggt gac cgt ggt gac tgt gcc ccc gac tcc ggt 1248
Ala Gly Gln Pro Gly Gly Asp Arg Gly Asp Cys Ala Pro Asp Ser Gly
405 410 415

gtc ccc tcc gac gtc gag gcc agc atc ccc gat gcc aag gtc gtc tgg 1296
Val Pro Ser Asp Val Glu Ala Ser Ile Pro Asp Ala Lys Val Val Trp
420 425 430

tcc aac atc cgc ttc ggt ccc atc ggc tct act gtc gag gtt taa 1341
Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr Val Glu Val
435 440 445



<210> 44 
<211> 446 
<212> PRT 

<213> Xylaria hypoxylon 
<400> 44 

Met Leu Ser Leu Ala Val Ser Ala Ala Leu Leu Gly Leu Ala Ser Ala
1 5 10 15

Gln Gln Val Gly Lys Glu Gln Ser Glu Thr His Pro Lys Leu Ser Trp
20 25 30

Lys Lys Cys Thr Ser Gly Gly Ser Cys Thr Gln Thr Asn Ala Glu Val
35 40 45

Thr Ile Asp Ser Asn Trp Arg Trp Leu His Ser Leu Glu Gly Thr Glu
50 55 60

Asn Cys Tyr Asp Gly Asn Lys Trp Thr Ser Gln Cys Ser Thr Gly Glu
65 70 75 80

Asp Cys Ala Thr Lys Cys Ala Ile Glu Gly Ala Asp Tyr Ser Lys Thr
85 90 95

Tyr Gly Ala Ser Thr Ser Gly Asp Ala Leu Thr Leu Lys Phe Leu Thr
100 105 110

Lys His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe Tyr Leu Met Asn
115 120 125

Gly Ala Asp Lys Tyr Gln Thr Phe Asp Leu Lys Gly Asn Glu Phe Thr
130 135 140

Phe Asp Val Asp Leu Ser Thr Val Asp Cys Gly Leu Asn Ala Ala Leu
145 150 155 160

Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Met Ala Ser Tyr Pro Asn
165 170 175

Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala Gln Cys
180 185 190

Ala Arg Asp Leu Lys Phe Val Gly Gly Lys Gly Asn Val Glu Gly Trp
195 200 205

Glu Pro Ser Thr Asn Asp Asp Asn Ala Gly Val Gly Pro Tyr Gly Ala
210 215 220

Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Ser His Ser Phe Ala
225 230 235 240

Phe Thr Pro His Pro Cys Thr Thr Asn Glu Tyr His Val Cys Glu Gln
245 250 255

Asp Glu Cys Gly Gly Thr Tyr Ser Glu Asp Arg Phe Ala Gly Lys Cys
260 265 270

Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met Gly Asn Thr Asp
275 280 285

Phe Tyr Gly Gln Gly Lys Thr Val Asp Thr Ser Lys Lys Phe Thr Val
290 295 300

Val Thr Gln Phe Ala Glu Asn Lys Leu Thr Gln Phe Phe Val Gln Asp
305 310 315 320

Gly Lys Lys Ile Glu Ile Pro Gly Pro Lys Ile Asp Gly Phe Pro Thr
325 330 335

Asp Ser Ala Ile Thr Pro Glu Tyr Cys Thr Ala Glu Phe Asn Val Leu
340 345 350

Gly Asp Arg Asp Arg Phe Ser Glu Val Gly Gly Phe Asp Gln Leu Asn
355 360 365

Asn Ala Leu Asp Val Pro Met Val Leu Val Met Ser Ile Trp Asp Asp
370 375 380

His Tyr Ala Asn Met Leu Trp Leu Asp Ser Ser Tyr Pro Pro Glu Lys
385 390 395 400

Ala Gly Gln Pro Gly Gly Asp Arg Gly Asp Cys Ala Pro Asp Ser Gly
405 410 415

Val Pro Ser Asp Val Glu Ala Ser Ile Pro Asp Ala Lys Val Val Trp
420 425 430

Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr Val Glu Val
435 440 445



<210> 45 

<211> 1584 

<212> DNA 

<213> Exidia glandulosa 
<220> 

<221> CDS 

<222> (1)..(1584)

<223>

<400> 45 

atg tac gcc aag ttc gct acc ctc gct gcc ctc gtg gca gct gcc agc 48
Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ala Ala Ser
1 5 10 15

gcc cag cag gca tgc aca ctc acc gcc gag aac cat ccc tcc atg act 96
Ala Gln Gln Ala Cys Thr Leu Thr Ala Glu Asn His Pro Ser Met Thr
20 25 30

tgg tct aag tgt gcc gcc gga ggt agc tgc act tcg gtt tct ggt tca 144
Trp Ser Lys Cys Ala Ala Gly Gly Ser Cys Thr Ser Val Ser Gly Ser
35 40 45

gtc acc atc gat gcc aac tgg cga tgg ctt cac cag ctc aac agc gcc 192
Val Thr Ile Asp Ala Asn Trp Arg Trp Leu His Gln Leu Asn Ser Ala
50 55 60

acc aac tgc tac gac ggc aac aag tgg aac acc acc tac tgc agc aca 240
Thr Asn Cys Tyr Asp Gly Asn Lys Trp Asn Thr Thr Tyr Cys Ser Thr
65 70 75 80

gat gct act tgc gct gct cag tgc tgt gtt gat ggc tca gac tat gct 288
Asp Ala Thr Cys Ala Ala Gln Cys Cys Val Asp Gly Ser Asp Tyr Ala
85 90 95

ggc acc tac ggt gcc acc act agc ggt aac gct ctg aac ctc aag ttc 336
Gly Thr Tyr Gly Ala Thr Thr Ser Gly Asn Ala Leu Asn Leu Lys Phe
100 105 110

gtc acc caa ggg tcc tat tct aag aac atc ggt tcc cgg ttg tac ctc 384
Val Thr Gln Gly Ser Tyr Ser Lys Asn Ile Gly Ser Arg Leu Tyr Leu
115 120 125

atg gag tcg gat acc aag tat cag atg ttt caa ctg ctc ggc cag gag 432
Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Gln Leu Leu Gly Gln Glu
130 135 140

ttc act ttc gac gta gat gtc tcc aac ttg ggc tgc ggt ctc aac ggt 480
Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly
145 150 155 160

gcc ctc tac ttc gtc agc atg gac gct gac ggt ggc acg tcc aag tat 528
Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Thr Ser Lys Tyr
165 170 175

acc ggc aac aag gcc ggc gcc aag tat ggc act ggc tac tgc gac agc 576
Thr Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
180 185 190

cag tgc ccg cgc gac ctg aag ttc atc aat ggt cag gcc aac gtc gag 624
Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu
195 200 205

ggc tgg act cct tcc acc aac gat gcc aac gcc ggc att ggc acc cac 672
Gly Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Gly Ile Gly Thr His
210 215 220

ggc tcc tgc tgt tcg gag atg gac atc tgg gag gct aac aat gtt gcc 720
Gly Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Val Ala
225 230 235 240

gct gcg tac acc ccc cat cct tgc aca act atc ggc cag tcg atc tgc 768
Ala Ala Tyr Thr Pro His Pro Cys Thr Thr Ile Gly Gln Ser Ile Cys
245 250 255

tcg ggc gat tct tgc gga gga acc tac agc tct gac cgt tac gcc ggt 816
Ser Gly Asp Ser Cys Gly Gly Thr Tyr Ser Ser Asp Arg Tyr Ala Gly
260 265 270

gtc tgc gat cca gac ggt tgc gat ttc aac agc tac cgc atg ggc gac 864
Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asp
275 280 285

acg ggc ttc tac ggc aag ggc ctg aca gtc gac acg agc tcc aag ttc 912
Thr Gly Phe Tyr Gly Lys Gly Leu Thr Val Asp Thr Ser Ser Lys Phe
290 295 300

acc gtc gtc acc cag ttc ctc acc ggc tcc gac ggc aac ctt tcc gag 960
Thr Val Val Thr Gln Phe Leu Thr Gly Ser Asp Gly Asn Leu Ser Glu
305 310 315 320

atc aag cgc ttc tac gtc cag aac ggc aag gtc att ccc aac tcg cag 1008
Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Gln
325 330 335

tcc aag att gcc ggc gtc agc ggc aac tcc atc acc acc gac ttc tgc 1056
Ser Lys Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Thr Asp Phe Cys
340 345 350

tcc gcc cag aag acc gcc ttc ggc gac acc aac gtc ttc gcg caa aag 1104
Ser Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Val Phe Ala Gln Lys
355 360 365

gga ggt ctc gcc ggg atg ggc gcc gcc ctc aag gcc ggc atg gtc ctc 1152
Gly Gly Leu Ala Gly Met Gly Ala Ala Leu Lys Ala Gly Met Val Leu
370 375 380

gtc atg tcc atc tgg gac gac cac gca gtc aac atg ctg tgg ctg gac 1200
Val Met Ser Ile Trp Asp Asp His Ala Val Asn Met Leu Trp Leu Asp
385 390 395 400

tcg acc tac ccg acc gac agc acc aag ccc ggc gcg gcc cgc ggc acc 1248
Ser Thr Tyr Pro Thr Asp Ser Thr Lys Pro Gly Ala Ala Arg Gly Thr
405 410 415

tgc ccg acc acc tcc ggc gtc ccc gcc gac gtc gag gcc cag gtc ccc 1296
Cys Pro Thr Thr Ser Gly Val Pro Ala Asp Val Glu Ala Gln Val Pro
420 425 430

aac tcg aac gtc atc tac tcc aac atc aag gtc ggc ccc atc aac tcg 1344
Asn Ser Asn Val Ile Tyr Ser Asn Ile Lys Val Gly Pro Ile Asn Ser
435 440 445

act ttc acc ggc ggc act tcc ggc ggc ggc ggt agc agc agc agc tcc 1392
Thr Phe Thr Gly Gly Thr Ser Gly Gly Gly Gly Ser Ser Ser Ser Ser
450 455 460

acc acc atc cga acc agc acc acc agc act cgc acc acc agc acc agc 1440
Thr Thr Ile Arg Thr Ser Thr Thr Ser Thr Arg Thr Thr Ser Thr Ser
465 470 475 480

acc gcg ccc ggc ggc ggc tcc act ggc agc gcc ggc gcc gat cac tgg 1488
Thr Ala Pro Gly Gly Gly Ser Thr Gly Ser Ala Gly Ala Asp His Trp
485 490 495

gcg caa tgc ggc ggt atc ggc tgg act ggt ccc acg acc tgc aag agc 1536
Ala Gln Cys Gly Gly Ile Gly Trp Thr Gly Pro Thr Thr Cys Lys Ser
500 505 510

ccg tac acg tgc aca gcc tcc aac ccg tac tac tcg cag tgc ttg taa 1584
Pro Tyr Thr Cys Thr Ala Ser Asn Pro Tyr Tyr Ser Gln Cys Leu
515 520 525



<210> 46 
<211> 527 
<212> PRT 

<213> Exidia glandulosa 
<400> 46 

Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ala Ala Ser
1 5 10 15

Ala Gln Gln Ala Cys Thr Leu Thr Ala Glu Asn His Pro Ser Met Thr
20 25 30

Trp Ser Lys Cys Ala Ala Gly Gly Ser Cys Thr Ser Val Ser Gly Ser
35 40 45

Val Thr Ile Asp Ala Asn Trp Arg Trp Leu His Gln Leu Asn Ser Ala
50 55 60

Thr Asn Cys Tyr Asp Gly Asn Lys Trp Asn Thr Thr Tyr Cys Ser Thr
65 70 75 80

Asp Ala Thr Cys Ala Ala Gln Cys Cys Val Asp Gly Ser Asp Tyr Ala
85 90 95

Gly Thr Tyr Gly Ala Thr Thr Ser Gly Asn Ala Leu Asn Leu Lys Phe
100 105 110

Val Thr Gln Gly Ser Tyr Ser Lys Asn Ile Gly Ser Arg Leu Tyr Leu
115 120 125

Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Gln Leu Leu Gly Gln Glu
130 135 140

Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly
145 150 155 160

Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Thr Ser Lys Tyr
165 170 175

Thr Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
180 185 190

Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu
195 200 205

Gly Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Gly Ile Gly Thr His
210 215 220

Gly Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Val Ala
225 230 235 240

Ala Ala Tyr Thr Pro His Pro Cys Thr Thr Ile Gly Gln Ser Ile Cys
245 250 255

Ser Gly Asp Ser Cys Gly Gly Thr Tyr Ser Ser Asp Arg Tyr Ala Gly
260 265 270

Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asp
275 280 285

Thr Gly Phe Tyr Gly Lys Gly Leu Thr Val Asp Thr Ser Ser Lys Phe
290 295 300

Thr Val Val Thr Gln Phe Leu Thr Gly Ser Asp Gly Asn Leu Ser Glu
305 310 315 320

Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Gln
325 330 335

Ser Lys Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Thr Asp Phe Cys
340 345 350

Ser Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Val Phe Ala Gln Lys
355 360 365
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Gly Gly Leu Ala 
370 

Val Met Ser lie 
385 

Ser Thr Tyr Pro 



Cys Pro Thr Thr 

420 

Asn Ser Asn Val 
435 

Thr Phe Thr Gly 
450 

Thr Thr lie Arg 
465 

Thr Ala Pro Gly 



Ala Gin Cys Gly 

500 

Pro Tyr Thr Cys 
515 



Gly Met Gly Ala 
375 

Trp Asp Asp His 
390 

Thr Asp Ser Thr 
405 

Ser Gly Val Pro 



lie Tyr Ser Asn 

440 

Gly Thr Ser Gly 
455 

Thr Ser Thr Thr 
470 

Gly Gly Ser Thr 
485 

Gly lie Gly Trp 



Thr Ala Ser Asn 

520 



Ala Leu Lys Ala 

380 

Ala Val Asn Met 

395 

Lys Pro Gly Ala 
410 

Ala Asp Val Glu 
425 

lie Lys Val Gly 



Gly Gly Gly Ser 

460 

Ser Thr Arg Thr 
475 

Gly Ser Ala Gly 
490 

Thr Gly Pro Thr 
505 

Pro Tyr Tyr Ser 



Gly Met Val Leu 



Leu Trp Leu Asp 

400 

Ala Arg Gly Thr 
415 

Ala Gin Val Pro 
430 

Pro lie Asn Ser 
445 

Ser Ser Ser Ser 



Thr Ser Thr Ser 

480 

Ala Asp His Trp 
495 

Thr Cys Lys Ser 
510 

Gin Cys Leu 
525 



<210> 47 

<211> 1368 

<212> DNA 

<213> Exidia glandulosa 
<220> 

<221> CDS 

<222> (1)..(1368)

<223> 

<400> 47 

atg tac gcc aag ttc gct acc ctc gct gcc ctc gtg gca gct gcc agc 48
Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ala Ala Ser
1 5 10 15

gcc cag cag gca tgc aca ctc acc gcc gag aac cat ccc tcc atg act 96
Ala Gln Gln Ala Cys Thr Leu Thr Ala Glu Asn His Pro Ser Met Thr
20 25 30

tgg tct aag tgt gcc gcc gga ggt agc tgc act tcg gtt tct ggt tca 144
Trp Ser Lys Cys Ala Ala Gly Gly Ser Cys Thr Ser Val Ser Gly Ser
35 40 45

gtc acc atc gat gcc aac tgg cga tgg ctt cac cag ctc aac agc gcc 192
Val Thr Ile Asp Ala Asn Trp Arg Trp Leu His Gln Leu Asn Ser Ala
50 55 60

acc aac tgc tac gac ggc aac aag tgg aac acc acc tac tgc agc aca 240
Thr Asn Cys Tyr Asp Gly Asn Lys Trp Asn Thr Thr Tyr Cys Ser Thr
65 70 75 80

gat gct act tgc gct gct cag tgc tgt gtt gat ggc tca gac tat gct 288
Asp Ala Thr Cys Ala Ala Gln Cys Cys Val Asp Gly Ser Asp Tyr Ala
85 90 95

ggc acc tac ggt gcc acc act agc ggt aac gct ctg aac ctc aag ttc 336
Gly Thr Tyr Gly Ala Thr Thr Ser Gly Asn Ala Leu Asn Leu Lys Phe
100 105 110

gtc acc caa ggg tcc tat tct aag aac atc ggt tcc cgg ttg tac ctc 384
Val Thr Gln Gly Ser Tyr Ser Lys Asn Ile Gly Ser Arg Leu Tyr Leu
115 120 125

atg gag tcg gat acc aag tat cag atg ttt caa ctg ctc ggc cag gag 432
Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Gln Leu Leu Gly Gln Glu
130 135 140

ttc act ttc gac gta gat gtc tcc aac ttg ggc tgc ggt ctc aac ggt 480
Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly
145 150 155 160

gcc ctc tac ttc gtc agc atg gac gct gac ggt ggc acg tcc aag tat 528
Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Thr Ser Lys Tyr
165 170 175

acc ggc aac aag gcc ggc gcc aag tat ggc act ggc tac tgc gac agc 576
Thr Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
180 185 190

cag tgc ccg cgc gac ctg aag ttc atc aat ggt cag gcc aac gtc gag 624
Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu
195 200 205

ggc tgg act cct tcc acc aac gat gcc aac gcc ggc att ggc acc cac 672
Gly Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Gly Ile Gly Thr His
210 215 220

ggc tcc tgc tgt tcg gag atg gac atc tgg gag gct aac aat gtt gcc 720
Gly Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Val Ala
225 230 235 240

gct gcg tac acc ccc cat cct tgc aca act atc ggc cag tcg atc tgc 768
Ala Ala Tyr Thr Pro His Pro Cys Thr Thr Ile Gly Gln Ser Ile Cys
245 250 255

tcg ggc gat tct tgc gga gga acc tac agc tct gac cgt tac gcc ggt 816
Ser Gly Asp Ser Cys Gly Gly Thr Tyr Ser Ser Asp Arg Tyr Ala Gly
260 265 270

gtc tgc gat cca gac ggt tgc gat ttc aac agc tac cgc atg ggc gac 864
Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asp
275 280 285

acg ggc ttc tac ggc aag ggc ctg aca gtc gac acg agc tcc aag ttc 912
Thr Gly Phe Tyr Gly Lys Gly Leu Thr Val Asp Thr Ser Ser Lys Phe
290 295 300

acc gtc gtc acc cag ttc ctc acc ggc tcc gac ggc aac ctt tcc gag 960
Thr Val Val Thr Gln Phe Leu Thr Gly Ser Asp Gly Asn Leu Ser Glu
305 310 315 320

atc aag cgc ttc tac gtc cag aac ggc aag gtc att ccc aac tcg cag 1008
Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Gln
325 330 335

tcc aag att gcc ggc gtc agc ggc aac tcc atc acc acc gac ttc tgc 1056
Ser Lys Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Thr Asp Phe Cys
340 345 350

tcc gcc cag aag acc gcc ttc ggc gac acc aac gtc ttc gcg caa aag 1104
Ser Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Val Phe Ala Gln Lys
355 360 365

gga ggt ctc gcc ggg atg ggc gcc gcc ctc aag gcc ggc atg gtc ctc 1152
Gly Gly Leu Ala Gly Met Gly Ala Ala Leu Lys Ala Gly Met Val Leu
370 375 380

gtc atg tcc atc tgg gac gat cac tac gcc aac atg ctg tgg ctc gac 1200
Val Met Ser Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp
385 390 395 400

tcg acc tac ccg act gac gcc tct ccc gat gag ccc ggc aag ggc cgc 1248
Ser Thr Tyr Pro Thr Asp Ala Ser Pro Asp Glu Pro Gly Lys Gly Arg
405 410 415

ggc acc tgc gac acc agc tcg ggt gtt cct gct gac atc gag acc agc 1296
Gly Thr Cys Asp Thr Ser Ser Gly Val Pro Ala Asp Ile Glu Thr Ser
420 425 430

cag gcc agc aac tca gtc atc tac tcg aac atc aag ttc gga ccc atc 1344
Gln Ala Ser Asn Ser Val Ile Tyr Ser Asn Ile Lys Phe Gly Pro Ile
435 440 445

aac tcg acc ttc aag gcg tcc taa 1368
Asn Ser Thr Phe Lys Ala Ser
450 455



<210> 48 

<211> 455 

<212> PRT 

<213> Exidia glandulosa 



<400> 48 



Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ala Ala Ser
1 5 10 15

Ala Gln Gln Ala Cys Thr Leu Thr Ala Glu Asn His Pro Ser Met Thr
20 25 30

Trp Ser Lys Cys Ala Ala Gly Gly Ser Cys Thr Ser Val Ser Gly Ser
35 40 45

Val Thr Ile Asp Ala Asn Trp Arg Trp Leu His Gln Leu Asn Ser Ala
50 55 60

Thr Asn Cys Tyr Asp Gly Asn Lys Trp Asn Thr Thr Tyr Cys Ser Thr
65 70 75 80

Asp Ala Thr Cys Ala Ala Gln Cys Cys Val Asp Gly Ser Asp Tyr Ala
85 90 95

Gly Thr Tyr Gly Ala Thr Thr Ser Gly Asn Ala Leu Asn Leu Lys Phe
100 105 110

Val Thr Gln Gly Ser Tyr Ser Lys Asn Ile Gly Ser Arg Leu Tyr Leu
115 120 125

Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Gln Leu Leu Gly Gln Glu
130 135 140

Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly
145 150 155 160

Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Thr Ser Lys Tyr
165 170 175

Thr Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
180 185 190

Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu
195 200 205

Gly Trp Thr Pro Ser Thr Asn Asp Ala Asn Ala Gly Ile Gly Thr His
210 215 220

Gly Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Val Ala
225 230 235 240

Ala Ala Tyr Thr Pro His Pro Cys Thr Thr Ile Gly Gln Ser Ile Cys
245 250 255

Ser Gly Asp Ser Cys Gly Gly Thr Tyr Ser Ser Asp Arg Tyr Ala Gly
260 265 270

Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asp
275 280 285

Thr Gly Phe Tyr Gly Lys Gly Leu Thr Val Asp Thr Ser Ser Lys Phe
290 295 300

Thr Val Val Thr Gln Phe Leu Thr Gly Ser Asp Gly Asn Leu Ser Glu
305 310 315 320

Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Gln
325 330 335

Ser Lys Ile Ala Gly Val Ser Gly Asn Ser Ile Thr Thr Asp Phe Cys
340 345 350

Ser Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Val Phe Ala Gln Lys
355 360 365

Gly Gly Leu Ala Gly Met Gly Ala Ala Leu Lys Ala Gly Met Val Leu
370 375 380

Val Met Ser Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp
385 390 395 400

Ser Thr Tyr Pro Thr Asp Ala Ser Pro Asp Glu Pro Gly Lys Gly Arg
405 410 415

Gly Thr Cys Asp Thr Ser Ser Gly Val Pro Ala Asp Ile Glu Thr Ser
420 425 430

Gln Ala Ser Asn Ser Val Ile Tyr Ser Asn Ile Lys Phe Gly Pro Ile
435 440 445

Asn Ser Thr Phe Lys Ala Ser
450 455



<210> 49 

<211> 1395 

<212> DNA 

<213> Poitrasia circinans 
<220> 

<221> CDS 

<222> (1)..(1395)

<223> 

<400> 49 

atg cat cag act tcc gtt ctt tct tcg ctc tct ttg ctc ctc gca gcc 48
Met His Gln Thr Ser Val Leu Ser Ser Leu Ser Leu Leu Leu Ala Ala
1 5 10 15

tcc ggt gcc cag cag gtc ggc acc cag aat gct gag act cac ccg agt 96
Ser Gly Ala Gln Gln Val Gly Thr Gln Asn Ala Glu Thr His Pro Ser
20 25 30

ctg acc acc cag aag tgt acc acc gac ggc ggc tgc acc gac cag tcc 144
Leu Thr Thr Gln Lys Cys Thr Thr Asp Gly Gly Cys Thr Asp Gln Ser
35 40 45

act gcc atc gtg ctt gac gcc aac tgg cgc tgg ctg cac acc acc gag 192
Thr Ala Ile Val Leu Asp Ala Asn Trp Arg Trp Leu His Thr Thr Glu
50 55 60

ggc tac acc aac tgc tac act ggc cag gaa tgg gac acc gac atc tgc 240
Gly Tyr Thr Asn Cys Tyr Thr Gly Gln Glu Trp Asp Thr Asp Ile Cys
65 70 75 80

tcc tcc ccg gag gct tgc gcc acc ggc tgc gct ctt gac ggt gcc gac 288
Ser Ser Pro Glu Ala Cys Ala Thr Gly Cys Ala Leu Asp Gly Ala Asp
85 90 95

tac gag ggc act tac ggc att acg act gac ggc aac gct ctt tcc atg 336
Tyr Glu Gly Thr Tyr Gly Ile Thr Thr Asp Gly Asn Ala Leu Ser Met
100 105 110

aag ttt gtc acc cag ggc tcg cag aag aac gtc ggc ggt cgt gtt tac 384
Lys Phe Val Thr Gln Gly Ser Gln Lys Asn Val Gly Gly Arg Val Tyr
115 120 125

ctg ctt gct ccc gac tcc gaa gat gcg tac gag ctc ttc aag ttg aag 432
Leu Leu Ala Pro Asp Ser Glu Asp Ala Tyr Glu Leu Phe Lys Leu Lys
130 135 140

aac cag gag ttc act ttc gac gtt gac gtc tcc gac ctc ccc tgc ggc 480
Asn Gln Glu Phe Thr Phe Asp Val Asp Val Ser Asp Leu Pro Cys Gly
145 150 155 160

ctg aac ggc gcc ctg tac ttc tcc gag atg gat gaa gat ggt ggc atg 528
Leu Asn Gly Ala Leu Tyr Phe Ser Glu Met Asp Glu Asp Gly Gly Met
165 170 175

tcc aag tac gag aac aac aag gcc ggc gcc aag tac ggc act ggc tac 576
Ser Lys Tyr Glu Asn Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr
180 185 190

tgc gac acg cag tgc ccc cac gac gtc aag ttc atc aac ggc gag gcc 624
Cys Asp Thr Gln Cys Pro His Asp Val Lys Phe Ile Asn Gly Glu Ala
195 200 205

aac att ctc aac tgg acc aag tcc gag acc gac gtc aac gcc ggc act 672
Asn Ile Leu Asn Trp Thr Lys Ser Glu Thr Asp Val Asn Ala Gly Thr
210 215 220

ggc caa tac ggc tcc tgc tgc aac gag atg gat atc tgg gag gcc aac 720
Gly Gln Tyr Gly Ser Cys Cys Asn Glu Met Asp Ile Trp Glu Ala Asn
225 230 235 240

tcg cag gcc acc gcc gtc act ccc cac gtc tgc aac gcc gat gtc atc 768
Ser Gln Ala Thr Ala Val Thr Pro His Val Cys Asn Ala Asp Val Ile
245 250 255

ggc cag gtc cgt tgc aac ggc acc gac tgc ggt gac ggc gac aac cgc 816
Gly Gln Val Arg Cys Asn Gly Thr Asp Cys Gly Asp Gly Asp Asn Arg
260 265 270

tac ggc ggc gtc tgc gac aag gat ggc tgc gac tac aac ccc tac cgc 864
Tyr Gly Gly Val Cys Asp Lys Asp Gly Cys Asp Tyr Asn Pro Tyr Arg
275 280 285

atg ggc aac gag tcg ttc tac ggc tcc aac ggc agc acc atc gac acc 912
Met Gly Asn Glu Ser Phe Tyr Gly Ser Asn Gly Ser Thr Ile Asp Thr
290 295 300

act gcc aag ttc acc gtc att acg cag ttc atc acc tcg gac aac act 960
Thr Ala Lys Phe Thr Val Ile Thr Gln Phe Ile Thr Ser Asp Asn Thr
305 310 315 320

tcg act ggc gac ctc gtt gag atc cgc cgc aag tac gtc cag gac ggc 1008
Ser Thr Gly Asp Leu Val Glu Ile Arg Arg Lys Tyr Val Gln Asp Gly
325 330 335

acc gtc atc gag aac tcg ttc gcc gac tac gac acc ctg gcc acg ttc 1056
Thr Val Ile Glu Asn Ser Phe Ala Asp Tyr Asp Thr Leu Ala Thr Phe
340 345 350

aac tcc atc tcg gac gac ttc tgc gac gcc cag aag acg ctc ttc ggc 1104
Asn Ser Ile Ser Asp Asp Phe Cys Asp Ala Gln Lys Thr Leu Phe Gly
355 360 365

gac gag aac gac ttc aag acc aag ggc ggc att gcc cgc atg ggc gag 1152
Asp Glu Asn Asp Phe Lys Thr Lys Gly Gly Ile Ala Arg Met Gly Glu
370 375 380

tcc ttc gag cgc ggc atg gtc ctc gtc atg agc atc tgg gat gac cac 1200
Ser Phe Glu Arg Gly Met Val Leu Val Met Ser Ile Trp Asp Asp His
385 390 395 400

gcg gcc aac gcc ctc tgg ctc gac tcg acc tac ccc gtc gac ggc gac 1248
Ala Ala Asn Ala Leu Trp Leu Asp Ser Thr Tyr Pro Val Asp Gly Asp
405 410 415

gcg acc aag cct ggc atc aag cgc ggc cct tgc ggc acc gac act ggt 1296
Ala Thr Lys Pro Gly Ile Lys Arg Gly Pro Cys Gly Thr Asp Thr Gly
420 425 430

gtt ccc gcc gac gtc gag tcg gag tcg ccc gat tcg acc gtc atc tac 1344
Val Pro Ala Asp Val Glu Ser Glu Ser Pro Asp Ser Thr Val Ile Tyr
435 440 445

tcc aac att cgc tac gga gac att ggc tcc acc ttc aac gcc acc gct 1392
Ser Asn Ile Arg Tyr Gly Asp Ile Gly Ser Thr Phe Asn Ala Thr Ala
450 455 460

tag 1395



<210> 50 
<211> 464 
<212> PRT 

<213> Poitrasia circinans 
<400> 50 

Met His Gln Thr Ser Val Leu Ser Ser Leu Ser Leu Leu Leu Ala Ala
1 5 10 15

Ser Gly Ala Gln Gln Val Gly Thr Gln Asn Ala Glu Thr His Pro Ser
20 25 30

Leu Thr Thr Gln Lys Cys Thr Thr Asp Gly Gly Cys Thr Asp Gln Ser
35 40 45

Thr Ala Ile Val Leu Asp Ala Asn Trp Arg Trp Leu His Thr Thr Glu
50 55 60

Gly Tyr Thr Asn Cys Tyr Thr Gly Gln Glu Trp Asp Thr Asp Ile Cys
65 70 75 80

Ser Ser Pro Glu Ala Cys Ala Thr Gly Cys Ala Leu Asp Gly Ala Asp
85 90 95

Tyr Glu Gly Thr Tyr Gly Ile Thr Thr Asp Gly Asn Ala Leu Ser Met
100 105 110

Lys Phe Val Thr Gln Gly Ser Gln Lys Asn Val Gly Gly Arg Val Tyr
115 120 125

Leu Leu Ala Pro Asp Ser Glu Asp Ala Tyr Glu Leu Phe Lys Leu Lys
130 135 140

Asn Gln Glu Phe Thr Phe Asp Val Asp Val Ser Asp Leu Pro Cys Gly
145 150 155 160

Leu Asn Gly Ala Leu Tyr Phe Ser Glu Met Asp Glu Asp Gly Gly Met
165 170 175

Ser Lys Tyr Glu Asn Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr
180 185 190

Cys Asp Thr Gln Cys Pro His Asp Val Lys Phe Ile Asn Gly Glu Ala
195 200 205

Asn Ile Leu Asn Trp Thr Lys Ser Glu Thr Asp Val Asn Ala Gly Thr
210 215 220

Gly Gln Tyr Gly Ser Cys Cys Asn Glu Met Asp Ile Trp Glu Ala Asn
225 230 235 240

Ser Gln Ala Thr Ala Val Thr Pro His Val Cys Asn Ala Asp Val Ile
245 250 255

Gly Gln Val Arg Cys Asn Gly Thr Asp Cys Gly Asp Gly Asp Asn Arg
260 265 270

Tyr Gly Gly Val Cys Asp Lys Asp Gly Cys Asp Tyr Asn Pro Tyr Arg
275 280 285

Met Gly Asn Glu Ser Phe Tyr Gly Ser Asn Gly Ser Thr Ile Asp Thr
290 295 300

Thr Ala Lys Phe Thr Val Ile Thr Gln Phe Ile Thr Ser Asp Asn Thr
305 310 315 320

Ser Thr Gly Asp Leu Val Glu Ile Arg Arg Lys Tyr Val Gln Asp Gly
325 330 335

Thr Val Ile Glu Asn Ser Phe Ala Asp Tyr Asp Thr Leu Ala Thr Phe
340 345 350

Asn Ser Ile Ser Asp Asp Phe Cys Asp Ala Gln Lys Thr Leu Phe Gly
355 360 365

Asp Glu Asn Asp Phe Lys Thr Lys Gly Gly Ile Ala Arg Met Gly Glu
370 375 380

Ser Phe Glu Arg Gly Met Val Leu Val Met Ser Ile Trp Asp Asp His
385 390 395 400

Ala Ala Asn Ala Leu Trp Leu Asp Ser Thr Tyr Pro Val Asp Gly Asp
405 410 415

Ala Thr Lys Pro Gly Ile Lys Arg Gly Pro Cys Gly Thr Asp Thr Gly
420 425 430

Val Pro Ala Asp Val Glu Ser Glu Ser Pro Asp Ser Thr Val Ile Tyr
435 440 445

Ser Asn Ile Arg Tyr Gly Asp Ile Gly Ser Thr Phe Asn Ala Thr Ala
450 455 460



<210> 51 

<211> 1383 

<212> DNA 

<213> Coprinus cinereus 
<220> 

<221> CDS 

<222> (1)..(1383)

<223> 

<400> 51 



atg ttc aag aaa gtc gcc ctc acc gct ctc tgc ttc ctc gcc gtc gca 48
Met Phe Lys Lys Val Ala Leu Thr Ala Leu Cys Phe Leu Ala Val Ala
1 5 10 15

cag gcc caa cag gtc ggt cgc gaa gtc gct gaa aac cac ccc cgt ctc 96
Gln Ala Gln Gln Val Gly Arg Glu Val Ala Glu Asn His Pro Arg Leu
20 25 30

ccg tgg cag cgt tgc act cgc aac ggc gga tgc cag act gtc tcc aac 144
Pro Trp Gln Arg Cys Thr Arg Asn Gly Gly Cys Gln Thr Val Ser Asn
35 40 45

ggt cag gtc gtc ctc gac gcc aac tgg cga tgg ctc cac gtc acc gac 192
Gly Gln Val Val Leu Asp Ala Asn Trp Arg Trp Leu His Val Thr Asp
50 55 60

ggc tac acc aac tgc tac acc ggt aac tcc tgg aac agc acc gtc tgc 240
Gly Tyr Thr Asn Cys Tyr Thr Gly Asn Ser Trp Asn Ser Thr Val Cys
65 70 75 80

tcc gac ccc acc acc tgc gct cag cga tgc gct ctc gag ggt gcc aac 288
Ser Asp Pro Thr Thr Cys Ala Gln Arg Cys Ala Leu Glu Gly Ala Asn
85 90 95

tac cag caa acc tac ggt atc acc acc aac gga gac gcc ctc acc atc 336
Tyr Gln Gln Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ala Leu Thr Ile
100 105 110

aag ttc ctc acc cga tcc caa caa acc aac gtc ggt gct cgt gtc tac 384
Lys Phe Leu Thr Arg Ser Gln Gln Thr Asn Val Gly Ala Arg Val Tyr
115 120 125

ctc atg gag aac gag aac cga tac cag atg ttc aac ctc ctc aac aag 432
Leu Met Glu Asn Glu Asn Arg Tyr Gln Met Phe Asn Leu Leu Asn Lys
130 135 140

gag ttc acc ttc gac gtt gac gtc tcc aag gtt cct tgc ggt atc aac 480
Glu Phe Thr Phe Asp Val Asp Val Ser Lys Val Pro Cys Gly Ile Asn
145 150 155 160

ggt gcc ctc tac ttc atc cag atg gac gcc gat ggt ggt atg agc aag 528
Gly Ala Leu Tyr Phe Ile Gln Met Asp Ala Asp Gly Gly Met Ser Lys
165 170 175

caa ccc aac aac agg gct ggt gct aag tac ggt acc ggc tac tgc gac 576
Gln Pro Asn Asn Arg Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

tct cag tgc ccc cgt gac atc aag ttc att gac ggc gtg gcc aac agc 624
Ser Gln Cys Pro Arg Asp Ile Lys Phe Ile Asp Gly Val Ala Asn Ser
195 200 205

gcc gac tgg act cca tcc gag acc gat ccc aat gcc gga agg ggt cgc 672
Ala Asp Trp Thr Pro Ser Glu Thr Asp Pro Asn Ala Gly Arg Gly Arg
210 215 220

tac ggc att tgc tgc gcc gag atg gat atc tgg gag gcc aac tcc atc 720
Tyr Gly Ile Cys Cys Ala Glu Met Asp Ile Trp Glu Ala Asn Ser Ile
225 230 235 240

tcc aat gcc tac acc ccc cac cct tgc cga acc cag aac gat ggt ggc 768
Ser Asn Ala Tyr Thr Pro His Pro Cys Arg Thr Gln Asn Asp Gly Gly
245 250 255

tac cag cgc tgc gag ggc cgc gac tgc aac cag cct cgc tat gag ggt 816
Tyr Gln Arg Cys Glu Gly Arg Asp Cys Asn Gln Pro Arg Tyr Glu Gly
260 265 270

ctt tgc gat cct gat ggc tgt gac tac aac ccc ttc cgc atg ggt aac 864
Leu Cys Asp Pro Asp Gly Cys Asp Tyr Asn Pro Phe Arg Met Gly Asn
275 280 285

aag gac ttc tac gga ccc gga aag acc gtc gac acc aac agg aag atg 912
Lys Asp Phe Tyr Gly Pro Gly Lys Thr Val Asp Thr Asn Arg Lys Met
290 295 300

acc gtc gtc acc caa ttc atc acc cac gac aac acc gac act ggc acc 960
Thr Val Val Thr Gln Phe Ile Thr His Asp Asn Thr Asp Thr Gly Thr
305 310 315 320

ctc gtt gac atc cgc cgc ctc tac gtt caa gac ggc cgt gtc att gcc 1008
Leu Val Asp Ile Arg Arg Leu Tyr Val Gln Asp Gly Arg Val Ile Ala
325 330 335

aac cct ccc acc aac ttc ccc ggt ctc atg ccc gcc cac gac tcc atc 1056
Asn Pro Pro Thr Asn Phe Pro Gly Leu Met Pro Ala His Asp Ser Ile
340 345 350

acc gag cag ttc tgc act gac cag aag aac ctc ttc ggc gac tac agc 1104
Thr Glu Gln Phe Cys Thr Asp Gln Lys Asn Leu Phe Gly Asp Tyr Ser
355 360 365

agc ttc gct cgt gac ggt ggt ctc gct cac atg ggt cgc tcc ctc gcc 1152
Ser Phe Ala Arg Asp Gly Gly Leu Ala His Met Gly Arg Ser Leu Ala
370 375 380

aag ggt cac gtc ctc gct ctc tcc atc tgg aac gac cac ggt gcc cac 1200
Lys Gly His Val Leu Ala Leu Ser Ile Trp Asn Asp His Gly Ala His
385 390 395 400

atg ttg tgg ctc gac tcc aac tac ccc acc gac gct gac ccc aac aag 1248
Met Leu Trp Leu Asp Ser Asn Tyr Pro Thr Asp Ala Asp Pro Asn Lys
405 410 415

ccc ggt att gct cgt ggt acc tgc ccg acc act ggt ggc acc ccc cgt 1296
Pro Gly Ile Ala Arg Gly Thr Cys Pro Thr Thr Gly Gly Thr Pro Arg
420 425 430

gaa acc gaa caa aac cac cct gat gcc cag gtc atc ttc tcc aac att 1344
Glu Thr Glu Gln Asn His Pro Asp Ala Gln Val Ile Phe Ser Asn Ile
435 440 445

aaa ttc ggt gac atc ggc tcg act ttc tct ggt tac taa 1383
Lys Phe Gly Asp Ile Gly Ser Thr Phe Ser Gly Tyr
450 455 460



<210> 52 

<211> 460 

<212> PRT 

<213> Coprinus cinereus 

<400> 52 



Met Phe Lys Lys Val Ala Leu Thr Ala Leu Cys Phe Leu Ala Val Ala
1 5 10 15

Gln Ala Gln Gln Val Gly Arg Glu Val Ala Glu Asn His Pro Arg Leu
20 25 30

Pro Trp Gln Arg Cys Thr Arg Asn Gly Gly Cys Gln Thr Val Ser Asn
35 40 45

Gly Gln Val Val Leu Asp Ala Asn Trp Arg Trp Leu His Val Thr Asp
50 55 60

Gly Tyr Thr Asn Cys Tyr Thr Gly Asn Ser Trp Asn Ser Thr Val Cys
65 70 75 80

Ser Asp Pro Thr Thr Cys Ala Gln Arg Cys Ala Leu Glu Gly Ala Asn
85 90 95

Tyr Gln Gln Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ala Leu Thr Ile
100 105 110

Lys Phe Leu Thr Arg Ser Gln Gln Thr Asn Val Gly Ala Arg Val Tyr
115 120 125

Leu Met Glu Asn Glu Asn Arg Tyr Gln Met Phe Asn Leu Leu Asn Lys
130 135 140

Glu Phe Thr Phe Asp Val Asp Val Ser Lys Val Pro Cys Gly Ile Asn
145 150 155 160

Gly Ala Leu Tyr Phe Ile Gln Met Asp Ala Asp Gly Gly Met Ser Lys
165 170 175

Gln Pro Asn Asn Arg Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

Ser Gln Cys Pro Arg Asp Ile Lys Phe Ile Asp Gly Val Ala Asn Ser
195 200 205

Ala Asp Trp Thr Pro Ser Glu Thr Asp Pro Asn Ala Gly Arg Gly Arg
210 215 220

Tyr Gly Ile Cys Cys Ala Glu Met Asp Ile Trp Glu Ala Asn Ser Ile
225 230 235 240

Ser Asn Ala Tyr Thr Pro His Pro Cys Arg Thr Gln Asn Asp Gly Gly
245 250 255

Tyr Gln Arg Cys Glu Gly Arg Asp Cys Asn Gln Pro Arg Tyr Glu Gly
260 265 270

Leu Cys Asp Pro Asp Gly Cys Asp Tyr Asn Pro Phe Arg Met Gly Asn
275 280 285

Lys Asp Phe Tyr Gly Pro Gly Lys Thr Val Asp Thr Asn Arg Lys Met
290 295 300

Thr Val Val Thr Gln Phe Ile Thr His Asp Asn Thr Asp Thr Gly Thr
305 310 315 320

Leu Val Asp Ile Arg Arg Leu Tyr Val Gln Asp Gly Arg Val Ile Ala
325 330 335

Asn Pro Pro Thr Asn Phe Pro Gly Leu Met Pro Ala His Asp Ser Ile
340 345 350

Thr Glu Gln Phe Cys Thr Asp Gln Lys Asn Leu Phe Gly Asp Tyr Ser
355 360 365

Ser Phe Ala Arg Asp Gly Gly Leu Ala His Met Gly Arg Ser Leu Ala
370 375 380

Lys Gly His Val Leu Ala Leu Ser Ile Trp Asn Asp His Gly Ala His
385 390 395 400

Met Leu Trp Leu Asp Ser Asn Tyr Pro Thr Asp Ala Asp Pro Asn Lys
405 410 415

Pro Gly Ile Ala Arg Gly Thr Cys Pro Thr Thr Gly Gly Thr Pro Arg
420 425 430

Glu Thr Glu Gln Asn His Pro Asp Ala Gln Val Ile Phe Ser Asn Ile
435 440 445

Lys Phe Gly Asp Ile Gly Ser Thr Phe Ser Gly Tyr
450 455 460



<210> 53 

<211> 1353 

<212> DNA 

<213> Acremonium sp.
<220> 

<221> CDS 

<222> (1)..(1353)

<223> 

<400> 53 

atg atg aag cag tat ctt cag tac ctg gcg gcg gct ctg ccc cta atg 48
Met Met Lys Gln Tyr Leu Gln Tyr Leu Ala Ala Ala Leu Pro Leu Met
1 5 10 15

ggc ctt gcc gcg ggc cag caa gcc ggc cgg gag acg ccc gaa aac cac 96
Gly Leu Ala Ala Gly Gln Gln Ala Gly Arg Glu Thr Pro Glu Asn His
20 25 30

ccc cgg ctc acc tgg aag aag tgc tcg ggc cag ggg tcc tgc cag acc 144
Pro Arg Leu Thr Trp Lys Lys Cys Ser Gly Gln Gly Ser Cys Gln Thr
35 40 45

gtc aac ggc gag gtc gtc att gat gcc aac tgg cgc tgg ctc cac gac 192
Val Asn Gly Glu Val Val Ile Asp Ala Asn Trp Arg Trp Leu His Asp
50 55 60

tcc aac atg cag aac tgc tac gac ggc aac cag tgg acc agc gcg tgc 240
Ser Asn Met Gln Asn Cys Tyr Asp Gly Asn Gln Trp Thr Ser Ala Cys
65 70 75 80

agc tcg gcc acc gac tgc gcc tcc aag tgc tac atc gag ggt gcc gac 288
Ser Ser Ala Thr Asp Cys Ala Ser Lys Cys Tyr Ile Glu Gly Ala Asp
85 90 95

tac ggc agg acc tac ggc gct tcg acg agc ggc gac tcc ctc acg ctc 336
Tyr Gly Arg Thr Tyr Gly Ala Ser Thr Ser Gly Asp Ser Leu Thr Leu
100 105 110

aag ttt gtc act cag cac gag tac ggt acc aac atc ggc tcg cgc ttc 384
Lys Phe Val Thr Gln His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe
115 120 125

tac ctg atg agc agc ccg acc cgg tac cag atg ttc acc ctc atg aac 432
Tyr Leu Met Ser Ser Pro Thr Arg Tyr Gln Met Phe Thr Leu Met Asn
130 135 140

aac gaa ttt gct ttc gat gtc gac ctc tcg acc gtc gag tgc ggc atc 480
Asn Glu Phe Ala Phe Asp Val Asp Leu Ser Thr Val Glu Cys Gly Ile
145 150 155 160

aac agc gcc ctg tac ttc gtc gcc atg gag gag gac ggc ggc atg gcc 528
Asn Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Met Ala
165 170 175

agc tac ccc acc aac aag gcc gga gcc aag tac ggc acg ggt tac tgc 576
Ser Tyr Pro Thr Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys
180 185 190

gac gcc caa tgc gcc cgt gat ctc aag ttc gtc ggc ggc aag gcc aac 624
Asp Ala Gln Cys Ala Arg Asp Leu Lys Phe Val Gly Gly Lys Ala Asn
195 200 205

att gag ggc tgg agg ccg tcc acc aac gac gcg aac gcc ggc gtc ggc 672
Ile Glu Gly Trp Arg Pro Ser Thr Asn Asp Ala Asn Ala Gly Val Gly
210 215 220

ccg atg ggc ggc tgc tgc gcg gaa atc gat gtt tgg gag tcc aac gcc 720
Pro Met Gly Gly Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Ala
225 230 235 240

cac gct ttt gcc ttc acg ccg cac gcg tgc gag aac aac aac tac cac 768
His Ala Phe Ala Phe Thr Pro His Ala Cys Glu Asn Asn Asn Tyr His
245 250 255

atc tgc gag acc tcc aac tgc ggc ggt acc tac tcc gac gac cgc ttc 816
Ile Cys Glu Thr Ser Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg Phe
260 265 270

gcc ggc ctc tgc gac gcc aac ggc tgc gac tac aac ccg tac cgc atg 864
Ala Gly Leu Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met
275 280 285

ggc aac ccc gac ttc tac ggc aag ggc aag act ctt gac acc tcg cgg 912
Gly Asn Pro Asp Phe Tyr Gly Lys Gly Lys Thr Leu Asp Thr Ser Arg
290 295 300

aag ttc acc gtc gtc acc cgc ttc cag gag aac gac ctc tcg cag tac 960
Lys Phe Thr Val Val Thr Arg Phe Gln Glu Asn Asp Leu Ser Gln Tyr
305 310 315 320

ttc atc cag gac ggc cgc aag atc gag atc ccg ccc ccg acc tgg gac 1008
Phe Ile Gln Asp Gly Arg Lys Ile Glu Ile Pro Pro Pro Thr Trp Asp
325 330 335

ggc ctc ccg aag agc agc cac atc acg ccc gag ctg tgc gcg acc cag 1056
Gly Leu Pro Lys Ser Ser His Ile Thr Pro Glu Leu Cys Ala Thr Gln
340 345 350

ttc gac gtc ttc gac gac cgc aac cgc ttc gag gag gtc ggc ggc ttc 1104
Phe Asp Val Phe Asp Asp Arg Asn Arg Phe Glu Glu Val Gly Gly Phe
355 360 365

ccc gcc ctc aac gcc gct ctc cgc atc ccc atg gtc ctt gtc atg tcc 1152
Pro Ala Leu Asn Ala Ala Leu Arg Ile Pro Met Val Leu Val Met Ser
370 375 380

atc tgg gac gac cac tac gcc aac atg ctc tgg ctc gac tcc gtc tac 1200
Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser Val Tyr
385 390 395 400

ccg ccc gag aag gag ggc acc ccc ggc gcc gag cgt ggc cct tgc ccc 1248
Pro Pro Glu Lys Glu Gly Thr Pro Gly Ala Glu Arg Gly Pro Cys Pro
405 410 415

cag acc tct ggt gtc ccc gcc gaa gtc gag gcc cag tac ccc aac gcc 1296
Gln Thr Ser Gly Val Pro Ala Glu Val Glu Ala Gln Tyr Pro Asn Ala
420 425 430

aag gtc gtc tgg tcc aac atc cgc ttc ggc ccc atc ggc tcg acc tac 1344
Lys Val Val Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr Tyr
435 440 445

aac atg taa 1353
Asn Met
450



<210> 54 
<211> 450 
<212> PRT 

<213> Acremonium sp.
<400> 54 

Met Met Lys Gln Tyr Leu Gln Tyr Leu Ala Ala Ala Leu Pro Leu Met
1 5 10 15

Gly Leu Ala Ala Gly Gln Gln Ala Gly Arg Glu Thr Pro Glu Asn His
20 25 30

Pro Arg Leu Thr Trp Lys Lys Cys Ser Gly Gln Gly Ser Cys Gln Thr
35 40 45

Val Asn Gly Glu Val Val Ile Asp Ala Asn Trp Arg Trp Leu His Asp
50 55 60

Ser Asn Met Gln Asn Cys Tyr Asp Gly Asn Gln Trp Thr Ser Ala Cys
65 70 75 80

Ser Ser Ala Thr Asp Cys Ala Ser Lys Cys Tyr Ile Glu Gly Ala Asp
85 90 95

Tyr Gly Arg Thr Tyr Gly Ala Ser Thr Ser Gly Asp Ser Leu Thr Leu
100 105 110

Lys Phe Val Thr Gln His Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe
115 120 125

Tyr Leu Met Ser Ser Pro Thr Arg Tyr Gln Met Phe Thr Leu Met Asn
130 135 140

Asn Glu Phe Ala Phe Asp Val Asp Leu Ser Thr Val Glu Cys Gly Ile
145 150 155 160

Asn Ser Ala Leu Tyr Phe Val Ala Met Glu Glu Asp Gly Gly Met Ala
165 170 175

Ser Tyr Pro Thr Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys
180 185 190

Asp Ala Gln Cys Ala Arg Asp Leu Lys Phe Val Gly Gly Lys Ala Asn
195 200 205

Ile Glu Gly Trp Arg Pro Ser Thr Asn Asp Ala Asn Ala Gly Val Gly
210 215 220

Pro Met Gly Gly Cys Cys Ala Glu Ile Asp Val Trp Glu Ser Asn Ala
225 230 235 240

His Ala Phe Ala Phe Thr Pro His Ala Cys Glu Asn Asn Asn Tyr His
245 250 255

Ile Cys Glu Thr Ser Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg Phe
260 265 270

Ala Gly Leu Cys Asp Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met
275 280 285

Gly Asn Pro Asp Phe Tyr Gly Lys Gly Lys Thr Leu Asp Thr Ser Arg
290 295 300

Lys Phe Thr Val Val Thr Arg Phe Gln Glu Asn Asp Leu Ser Gln Tyr
305 310 315 320

Phe Ile Gln Asp Gly Arg Lys Ile Glu Ile Pro Pro Pro Thr Trp Asp
325 330 335

Gly Leu Pro Lys Ser Ser His Ile Thr Pro Glu Leu Cys Ala Thr Gln
340 345 350

Phe Asp Val Phe Asp Asp Arg Asn Arg Phe Glu Glu Val Gly Gly Phe
355 360 365

Pro Ala Leu Asn Ala Ala Leu Arg Ile Pro Met Val Leu Val Met Ser
370 375 380

Ile Trp Asp Asp His Tyr Ala Asn Met Leu Trp Leu Asp Ser Val Tyr
385 390 395 400

Pro Pro Glu Lys Glu Gly Thr Pro Gly Ala Glu Arg Gly Pro Cys Pro
405 410 415

Gln Thr Ser Gly Val Pro Ala Glu Val Glu Ala Gln Tyr Pro Asn Ala
420 425 430

Lys Val Val Trp Ser Asn Ile Arg Phe Gly Pro Ile Gly Ser Thr Tyr
435 440 445

Asn Met
450

<210> 55

<211> 1599 

<212> DNA 

<213> Chaetomidium pingtungium 
<220> 

<221> CDS 

<222> (1)..(1599)

<223> 

<400> 55 

atg ctg gcc tcc acc ttc tcc tac cgc atg tac aag acc gcg ctc atc 48
Met Leu Ala Ser Thr Phe Ser Tyr Arg Met Tyr Lys Thr Ala Leu Ile
1 5 10 15

ctg gcc gcc ctt ctg ggc tct ggc cag gct cag cag gtc ggt act tcc 96
Leu Ala Ala Leu Leu Gly Ser Gly Gln Ala Gln Gln Val Gly Thr Ser
20 25 30

cag gcg gaa gtg cat ccg tcc atg acc tgg cag agc tgc acg gct ggc 144
Gln Ala Glu Val His Pro Ser Met Thr Trp Gln Ser Cys Thr Ala Gly
35 40 45

ggc agc tgc acc acc aac aac ggc aag gtg gtc atc gac gcg aac tgg 192
Gly Ser Cys Thr Thr Asn Asn Gly Lys Val Val Ile Asp Ala Asn Trp
50 55 60

cgt tgg gtg cac aaa gtc ggc gac tac acc aac tgc tac acc ggc aac 240
Arg Trp Val His Lys Val Gly Asp Tyr Thr Asn Cys Tyr Thr Gly Asn
65 70 75 80

acc tgg gac acg act atc tgc cct gac gat gcg acc tgc gca tcc aac 288
Thr Trp Asp Thr Thr Ile Cys Pro Asp Asp Ala Thr Cys Ala Ser Asn
85 90 95

tgc gcc ctt gag ggt gcc aac tac gaa tcc acc tat ggt gtg acc gcc 336
Cys Ala Leu Glu Gly Ala Asn Tyr Glu Ser Thr Tyr Gly Val Thr Ala
100 105 110

agc ggc aat tcc ctc cgc ctc aac ttc gtc acc acc agc cag cag aag 384
Ser Gly Asn Ser Leu Arg Leu Asn Phe Val Thr Thr Ser Gln Gln Lys
115 120 125

aac att ggc tcg cgt ctg tac atg atg aag gac gac tcg acc tac gag 432
Asn Ile Gly Ser Arg Leu Tyr Met Met Lys Asp Asp Ser Thr Tyr Glu
130 135 140

atg ttt aag ctg ctg aac cag gag ttc acc ttc gat gtc gat gtc tcc 480
Met Phe Lys Leu Leu Asn Gln Glu Phe Thr Phe Asp Val Asp Val Ser
145 150 155 160

aac ctc ccc tgc ggt ctc aac ggt gct ctg tac ttt gtc gcc atg gac 528
Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe Val Ala Met Asp
165 170 175

gcc ggc ggt ggc atg tcc aag tac cca acc aac aag gcc ggt gcc aag 576
Ala Gly Gly Gly Met Ser Lys Tyr Pro Thr Asn Lys Ala Gly Ala Lys
180 185 190

tac ggt act gga tac tgt gac tcg cag tgc cct cgc gac ctc aag ttc 624
Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro Arg Asp Leu Lys Phe
195 200 205

atc aac ggt cag gcc aac gtt gaa ggg tgg cag ccc tcc tcc aac gat 672
Ile Asn Gly Gln Ala Asn Val Glu Gly Trp Gln Pro Ser Ser Asn Asp
210 215 220

gcc aat gcg ggt acc ggc aac cac ggg tcc tgc tgc gcg gag atg gat 720
Ala Asn Ala Gly Thr Gly Asn His Gly Ser Cys Cys Ala Glu Met Asp
225 230 235 240

atc tgg gag gcc aac agc atc tcc acg gcc ttc acc ccc cat ccg tgc 768
Ile Trp Glu Ala Asn Ser Ile Ser Thr Ala Phe Thr Pro His Pro Cys
245 250 255

gac acg ccc ggc cag gtg atg tgc acc ggt gat gcc tgc ggt ggc acc 816
Asp Thr Pro Gly Gln Val Met Cys Thr Gly Asp Ala Cys Gly Gly Thr
260 265 270

tac agc tcc gac cgc tac ggc ggc acc tgc gac ccc gac gga tgt gat 864
Tyr Ser Ser Asp Arg Tyr Gly Gly Thr Cys Asp Pro Asp Gly Cys Asp
275 280 285

ttc aac tcc ttc cgc cag ggc aac aag acc ttc tac ggc cct ggc atg 912
Phe Asn Ser Phe Arg Gln Gly Asn Lys Thr Phe Tyr Gly Pro Gly Met
290 295 300

acc gtc gac acc aag agc aag ttt acc gtc gtc acc cag ttc atc acc 960
Thr Val Asp Thr Lys Ser Lys Phe Thr Val Val Thr Gln Phe Ile Thr
305 310 315 320

gac gac ggc acc tcc agc ggc acc ctc aag gag atc aag cgc ttc tac 1008
Asp Asp Gly Thr Ser Ser Gly Thr Leu Lys Glu Ile Lys Arg Phe Tyr
325 330 335

gtg cag aac ggc aag gtg atc ccc aac tcg gag tcg acc tgg acc ggc 1056
Val Gln Asn Gly Lys Val Ile Pro Asn Ser Glu Ser Thr Trp Thr Gly
340 345 350

gtc agc ggc aac tcc atc acc acc gag tac tgc acc gcc cag aag agc 1104
Val Ser Gly Asn Ser Ile Thr Thr Glu Tyr Cys Thr Ala Gln Lys Ser
355 360 365

ctg ttc cag gac cag aac gtc ttc gaa aag cac ggc ggc ctc gag ggc 1152
Leu Phe Gln Asp Gln Asn Val Phe Glu Lys His Gly Gly Leu Glu Gly
370 375 380

atg ggt gct gcc ctc gcc cag ggc atg gtt ctc gtc atg tcc ctg tgg 1200
Met Gly Ala Ala Leu Ala Gln Gly Met Val Leu Val Met Ser Leu Trp
385 390 395 400

gat gat cac tcg gcc aac atg ctc tgg ctc gac agc aac tac ccg acc 1248
Asp Asp His Ser Ala Asn Met Leu Trp Leu Asp Ser Asn Tyr Pro Thr
405 410 415

act gcc tct tcc acc act ccc ggc gtc gcc cgt ggt acc tgc gac atc 1296
Thr Ala Ser Ser Thr Thr Pro Gly Val Ala Arg Gly Thr Cys Asp Ile
420 425 430

tcc tcc ggc gtc cct gcg gat gtc gag gcg aac cac ccc gac gcc tac 1344
Ser Ser Gly Val Pro Ala Asp Val Glu Ala Asn His Pro Asp Ala Tyr
435 440 445
gtc gtc tac tcc aac atc aag gtc ggc ccc atc ggc tcg acc ttc aac 1392
Val Val Tyr Ser Asn Ile Lys Val Gly Pro Ile Gly Ser Thr Phe Asn
450 455 460

agc ggt ggc tcg aac ccc ggt ggc gga acc acc acg aca act acc acc 1440
Ser Gly Gly Ser Asn Pro Gly Gly Gly Thr Thr Thr Thr Thr Thr Thr
465 470 475 480

cag cct act acc acc acg acc acg gct gga aac cct ggc ggc acc gga 1488
Gln Pro Thr Thr Thr Thr Thr Thr Ala Gly Asn Pro Gly Gly Thr Gly
485 490 495

gtc gca cag cac tat ggc cag tgt ggt gga atc gga tgg acc gga ccc 1536
Val Ala Gln His Tyr Gly Gln Cys Gly Gly Ile Gly Trp Thr Gly Pro
500 505 510

aca acc tgt gcc agc cct tat acc tgc cag aag ctg aat gat tat tac 1584
Thr Thr Cys Ala Ser Pro Tyr Thr Cys Gln Lys Leu Asn Asp Tyr Tyr
515 520 525

tct cag tgc ctg tag 1599
Ser Gln Cys Leu
530



<210> 56 

<211> 532 

<212> PRT 

<213> Chaetomidium pingtungium 



<400> 56 



Met Leu Ala Ser Thr Phe Ser Tyr Arg Met Tyr Lys Thr Ala Leu Ile
1 5 10 15

Leu Ala Ala Leu Leu Gly Ser Gly Gln Ala Gln Gln Val Gly Thr Ser
20 25 30

Gln Ala Glu Val His Pro Ser Met Thr Trp Gln Ser Cys Thr Ala Gly
35 40 45

Gly Ser Cys Thr Thr Asn Asn Gly Lys Val Val Ile Asp Ala Asn Trp
50 55 60

Arg Trp Val His Lys Val Gly Asp Tyr Thr Asn Cys Tyr Thr Gly Asn
65 70 75 80

Thr Trp Asp Thr Thr Ile Cys Pro Asp Asp Ala Thr Cys Ala Ser Asn
85 90 95

Cys Ala Leu Glu Gly Ala Asn Tyr Glu Ser Thr Tyr Gly Val Thr Ala
100 105 110

Ser Gly Asn Ser Leu Arg Leu Asn Phe Val Thr Thr Ser Gln Gln Lys
115 120 125

Asn Ile Gly Ser Arg Leu Tyr Met Met Lys Asp Asp Ser Thr Tyr Glu
130 135 140

Met Phe Lys Leu Leu Asn Gln Glu Phe Thr Phe Asp Val Asp Val Ser
145 150 155 160

Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe Val Ala Met Asp
165 170 175

Ala Gly Gly Gly Met Ser Lys Tyr Pro Thr Asn Lys Ala Gly Ala Lys
180 185 190

Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro Arg Asp Leu Lys Phe
195 200 205

Ile Asn Gly Gln Ala Asn Val Glu Gly Trp Gln Pro Ser Ser Asn Asp
210 215 220

Ala Asn Ala Gly Thr Gly Asn His Gly Ser Cys Cys Ala Glu Met Asp
225 230 235 240

Ile Trp Glu Ala Asn Ser Ile Ser Thr Ala Phe Thr Pro His Pro Cys
245 250 255

Asp Thr Pro Gly Gln Val Met Cys Thr Gly Asp Ala Cys Gly Gly Thr
260 265 270

Tyr Ser Ser Asp Arg Tyr Gly Gly Thr Cys Asp Pro Asp Gly Cys Asp
275 280 285

Phe Asn Ser Phe Arg Gln Gly Asn Lys Thr Phe Tyr Gly Pro Gly Met
290 295 300

Thr Val Asp Thr Lys Ser Lys Phe Thr Val Val Thr Gln Phe Ile Thr
305 310 315 320

Asp Asp Gly Thr Ser Ser Gly Thr Leu Lys Glu Ile Lys Arg Phe Tyr
325 330 335

Val Gln Asn Gly Lys Val Ile Pro Asn Ser Glu Ser Thr Trp Thr Gly
340 345 350

Val Ser Gly Asn Ser Ile Thr Thr Glu Tyr Cys Thr Ala Gln Lys Ser
355 360 365

Leu Phe Gln Asp Gln Asn Val Phe Glu Lys His Gly Gly Leu Glu Gly
370 375 380

Met Gly Ala Ala Leu Ala Gln Gly Met Val Leu Val Met Ser Leu Trp
385 390 395 400

Asp Asp His Ser Ala Asn Met Leu Trp Leu Asp Ser Asn Tyr Pro Thr
405 410 415

Thr Ala Ser Ser Thr Thr Pro Gly Val Ala Arg Gly Thr Cys Asp Ile
420 425 430

Ser Ser Gly Val Pro Ala Asp Val Glu Ala Asn His Pro Asp Ala Tyr
435 440 445

Val Val Tyr Ser Asn Ile Lys Val Gly Pro Ile Gly Ser Thr Phe Asn
450 455 460

Ser Gly Gly Ser Asn Pro Gly Gly Gly Thr Thr Thr Thr Thr Thr Thr
465 470 475 480

Gln Pro Thr Thr Thr Thr Thr Thr Ala Gly Asn Pro Gly Gly Thr Gly
485 490 495

Val Ala Gln His Tyr Gly Gln Cys Gly Gly Ile Gly Trp Thr Gly Pro
500 505 510

Thr Thr Cys Ala Ser Pro Tyr Thr Cys Gln Lys Leu Asn Asp Tyr Tyr
515 520 525

Ser Gln Cys Leu
530



<210> 57
<211> 1383
<212> DNA
<213> Sporotrichum pruinosum
<220>
<221> CDS
<222> (1)..(1383)
<223>

<400> 57

atg ttc aag aaa gtc gcc ctc acc gct ctc tgc ttc ctc gcc gtc gca 48
Met Phe Lys Lys Val Ala Leu Thr Ala Leu Cys Phe Leu Ala Val Ala
1 5 10 15

cag gcc caa cag gtc ggt cgc gaa gtc gct gaa aac cac ccc cgt ctc 96
Gln Ala Gln Gln Val Gly Arg Glu Val Ala Glu Asn His Pro Arg Leu
20 25 30

ccg tgg cag cgt tgc act cgc aac ggc gga tgc cag act gtc tct aac 144
Pro Trp Gln Arg Cys Thr Arg Asn Gly Gly Cys Gln Thr Val Ser Asn
35 40 45

ggt cag gtc gtc ctc gac gcc aac tgg cga tgg ctc cac gtc acc gat 192
Gly Gln Val Val Leu Asp Ala Asn Trp Arg Trp Leu His Val Thr Asp
50 55 60

ggc tac acc aac tgc tac acc ggt aac tcc tgg aac agc acc gtc tgc 240
Gly Tyr Thr Asn Cys Tyr Thr Gly Asn Ser Trp Asn Ser Thr Val Cys
65 70 75 80

tcc gac ccc acc acc tgc gct cag cga tgc gct ctc gag ggt gcc aac 288
Ser Asp Pro Thr Thr Cys Ala Gln Arg Cys Ala Leu Glu Gly Ala Asn
85 90 95

tac cag caa acc tac ggt atc acc acc aac gga gac gcc ctc acc atc 336
Tyr Gln Gln Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ala Leu Thr Ile
100 105 110

aag ttc ctc acc cga tcc caa caa acc aac gtc ggt gct cgt gtc tac 384
Lys Phe Leu Thr Arg Ser Gln Gln Thr Asn Val Gly Ala Arg Val Tyr
115 120 125

ctc atg gag aac gag aac cga tac cag atg ttc aac ctc ctc aac aag 432
Leu Met Glu Asn Glu Asn Arg Tyr Gln Met Phe Asn Leu Leu Asn Lys
130 135 140

gag ttc acc ttc gac gtt gac gtc tcc aag gtt cct tgc ggt atc aac 480
Glu Phe Thr Phe Asp Val Asp Val Ser Lys Val Pro Cys Gly Ile Asn
145 150 155 160

ggt gcc ctc tac ttc atc cag atg gac gcc gat ggt ggt atg agc aag 528
Gly Ala Leu Tyr Phe Ile Gln Met Asp Ala Asp Gly Gly Met Ser Lys
165 170 175

caa ccc aac aac agg gct ggt gct aag tac ggt acc ggc tac tgc gac 576
Gln Pro Asn Asn Arg Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

tct cag tgc ccc cgt gac atc aag ttc att gac ggc gtg gcc aac agc 624
Ser Gln Cys Pro Arg Asp Ile Lys Phe Ile Asp Gly Val Ala Asn Ser
195 200 205

gcc gac tgg act cca tcc gag acc gat ccc aat gcc gga agg ggt cgc 672
Ala Asp Trp Thr Pro Ser Glu Thr Asp Pro Asn Ala Gly Arg Gly Arg
210 215 220

tac ggc att tgc tgc gcc gag atg gat atc tgg gag gcc aac tcc atc 720
Tyr Gly Ile Cys Cys Ala Glu Met Asp Ile Trp Glu Ala Asn Ser Ile
225 230 235 240

tcc aat gcc tac acc ccc cac cct tgc cga acc cag aac gat ggt ggc 768
Ser Asn Ala Tyr Thr Pro His Pro Cys Arg Thr Gln Asn Asp Gly Gly
245 250 255

tac cag cgc tgc gag ggc cgc gac tgc aac cag cct cgc tat gag ggt 816
Tyr Gln Arg Cys Glu Gly Arg Asp Cys Asn Gln Pro Arg Tyr Glu Gly
260 265 270

ctt tgc gat cct gat ggc tgt gac tac aac ccc ttc cgc atg ggt aac 864
Leu Cys Asp Pro Asp Gly Cys Asp Tyr Asn Pro Phe Arg Met Gly Asn
275 280 285

aag gac ttc tac gga ccc gga aag acc atc gac acc aac agg aag atg 912
Lys Asp Phe Tyr Gly Pro Gly Lys Thr Ile Asp Thr Asn Arg Lys Met
290 295 300

acc gtc gtc acc caa ttc atc acc cac gac aac acc gac act ggc acc 960
Thr Val Val Thr Gln Phe Ile Thr His Asp Asn Thr Asp Thr Gly Thr
305 310 315 320

ctc gtt gac atc cgc cgc ctc tac gtt caa gac ggc cgt gtc att gcc 1008
Leu Val Asp Ile Arg Arg Leu Tyr Val Gln Asp Gly Arg Val Ile Ala
325 330 335

aac cct ccc acc aac ttc ccc ggt ctc atg ccc gcc cac gac tcc atc 1056
Asn Pro Pro Thr Asn Phe Pro Gly Leu Met Pro Ala His Asp Ser Ile
340 345 350

acc gag cag ttc tgc act gac cag aag aac ctc ttc ggc gac tac agc 1104
Thr Glu Gln Phe Cys Thr Asp Gln Lys Asn Leu Phe Gly Asp Tyr Ser
355 360 365

agc ttc gct cgt gac ggt ggt ctc gct cac atg ggt cgc tcc ctc gcc 1152
Ser Phe Ala Arg Asp Gly Gly Leu Ala His Met Gly Arg Ser Leu Ala
370 375 380

aag ggt cac gtc ctc gct ctc tcc atc tgg aac gac cac ggt gcc cac 1200
Lys Gly His Val Leu Ala Leu Ser Ile Trp Asn Asp His Gly Ala His
385 390 395 400

atg ttg tgg ctc gac tcc aac tac ccc acc gac gct gac ccc aac aag 1248
Met Leu Trp Leu Asp Ser Asn Tyr Pro Thr Asp Ala Asp Pro Asn Lys
405 410 415

ccc ggt att gct cgt ggt acc tgc ccg acc act ggt ggc acc ccc cgt 1296
Pro Gly Ile Ala Arg Gly Thr Cys Pro Thr Thr Gly Gly Thr Pro Arg
420 425 430

gaa acc gaa caa aac cac cct gat gcc cag gtc atc ttc tcc aac att 1344
Glu Thr Glu Gln Asn His Pro Asp Ala Gln Val Ile Phe Ser Asn Ile
435 440 445

aaa ttc ggt gac atc ggc tcg act ttc tct ggt tac taa 1383
Lys Phe Gly Asp Ile Gly Ser Thr Phe Ser Gly Tyr
450 455 460



<210> 58 

<211> 460 

<212> PRT 

<213> Sporotrichum pruinosum 



<400> 58 



Met Phe Lys Lys Val Ala Leu Thr Ala Leu Cys Phe Leu Ala Val Ala
1 5 10 15

Gln Ala Gln Gln Val Gly Arg Glu Val Ala Glu Asn His Pro Arg Leu
20 25 30

Pro Trp Gln Arg Cys Thr Arg Asn Gly Gly Cys Gln Thr Val Ser Asn
35 40 45

Gly Gln Val Val Leu Asp Ala Asn Trp Arg Trp Leu His Val Thr Asp
50 55 60

Gly Tyr Thr Asn Cys Tyr Thr Gly Asn Ser Trp Asn Ser Thr Val Cys
65 70 75 80

Ser Asp Pro Thr Thr Cys Ala Gln Arg Cys Ala Leu Glu Gly Ala Asn
85 90 95

Tyr Gln Gln Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ala Leu Thr Ile
100 105 110

Lys Phe Leu Thr Arg Ser Gln Gln Thr Asn Val Gly Ala Arg Val Tyr
115 120 125

Leu Met Glu Asn Glu Asn Arg Tyr Gln Met Phe Asn Leu Leu Asn Lys
130 135 140

Glu Phe Thr Phe Asp Val Asp Val Ser Lys Val Pro Cys Gly Ile Asn
145 150 155 160

Gly Ala Leu Tyr Phe Ile Gln Met Asp Ala Asp Gly Gly Met Ser Lys
165 170 175

Gln Pro Asn Asn Arg Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

Ser Gln Cys Pro Arg Asp Ile Lys Phe Ile Asp Gly Val Ala Asn Ser
195 200 205

Ala Asp Trp Thr Pro Ser Glu Thr Asp Pro Asn Ala Gly Arg Gly Arg
210 215 220

Tyr Gly Ile Cys Cys Ala Glu Met Asp Ile Trp Glu Ala Asn Ser Ile
225 230 235 240

Ser Asn Ala Tyr Thr Pro His Pro Cys Arg Thr Gln Asn Asp Gly Gly
245 250 255

Tyr Gln Arg Cys Glu Gly Arg Asp Cys Asn Gln Pro Arg Tyr Glu Gly
260 265 270

Leu Cys Asp Pro Asp Gly Cys Asp Tyr Asn Pro Phe Arg Met Gly Asn
275 280 285

Lys Asp Phe Tyr Gly Pro Gly Lys Thr Ile Asp Thr Asn Arg Lys Met
290 295 300

Thr Val Val Thr Gln Phe Ile Thr His Asp Asn Thr Asp Thr Gly Thr
305 310 315 320

Leu Val Asp Ile Arg Arg Leu Tyr Val Gln Asp Gly Arg Val Ile Ala
325 330 335

Asn Pro Pro Thr Asn Phe Pro Gly Leu Met Pro Ala His Asp Ser Ile
340 345 350

Thr Glu Gln Phe Cys Thr Asp Gln Lys Asn Leu Phe Gly Asp Tyr Ser
355 360 365

Ser Phe Ala Arg Asp Gly Gly Leu Ala His Met Gly Arg Ser Leu Ala
370 375 380

Lys Gly His Val Leu Ala Leu Ser Ile Trp Asn Asp His Gly Ala His
385 390 395 400

Met Leu Trp Leu Asp Ser Asn Tyr Pro Thr Asp Ala Asp Pro Asn Lys
405 410 415

Pro Gly Ile Ala Arg Gly Thr Cys Pro Thr Thr Gly Gly Thr Pro Arg
420 425 430

Glu Thr Glu Gln Asn His Pro Asp Ala Gln Val Ile Phe Ser Asn Ile
435 440 445

Lys Phe Gly Asp Ile Gly Ser Thr Phe Ser Gly Tyr
450 455 460



<210> 59 

<211> 1578 

<212> DNA 

<213> Scytalidium thermophilum 
<220> 

<221> CDS 

<222> (1)..(1578)

<223> 

<400> 59 
atg cgt acc gcc aag ttc gcc acc ctc gcc gcc ctt gtg gcc tcg gcc 48
Met Arg Thr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ser Ala
1 5 10 15

gcc gcc cag cag gcg tgc agt ctc acc acc gag agg cac cct tcc ctc 96
Ala Ala Gln Gln Ala Cys Ser Leu Thr Thr Glu Arg His Pro Ser Leu
20 25 30

tct tgg aag aag tgc acc gcc ggc ggc cag tgc cag acc gtc cag gct 144
Ser Trp Lys Lys Cys Thr Ala Gly Gly Gln Cys Gln Thr Val Gln Ala
35 40 45

tcc atc act ctc gac tcc aac tgg cgc tgg act cac cag gtg tct ggc 192
Ser Ile Thr Leu Asp Ser Asn Trp Arg Trp Thr His Gln Val Ser Gly
50 55 60

tcc acc aac tgc tac acg ggc aac aag tgg gat act agc atc tgc act 240
Ser Thr Asn Cys Tyr Thr Gly Asn Lys Trp Asp Thr Ser Ile Cys Thr
65 70 75 80

gat gcc aag tcg tgc gct cag aac tgc tgc gtc gat ggt gcc gac tac 288
Asp Ala Lys Ser Cys Ala Gln Asn Cys Cys Val Asp Gly Ala Asp Tyr
85 90 95

acc agc acc tat ggc atc acc acc aac ggt gat tcc ctg agc ctc aag 336
Thr Ser Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ser Leu Ser Leu Lys
100 105 110

ttc gtc acc aag ggc cag cac tcg acc aac gtc ggc tcg cgt acc tac 384
Phe Val Thr Lys Gly Gln His Ser Thr Asn Val Gly Ser Arg Thr Tyr
115 120 125

ctg atg gac ggc gag gac aag tat cag acc ttc gag ctc ctc ggc aac 432
Leu Met Asp Gly Glu Asp Lys Tyr Gln Thr Phe Glu Leu Leu Gly Asn
130 135 140

gag ttc acc ttc gat gtc gat gtc tcc aac atc ggc tgc ggt ctc aac 480
Glu Phe Thr Phe Asp Val Asp Val Ser Asn Ile Gly Cys Gly Leu Asn
145 150 155 160

ggc gcc ctg tac ttc gtc tcc atg gac gcc gat ggt ggt ctc agc cgc 528
Gly Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Leu Ser Arg
165 170 175

tat cct ggc aac aag gct ggt gcc aag tac ggt acc ggc tac tgc gat 576
Tyr Pro Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

gct cag tgc ccc cgt gac atc aag ttc atc aac ggc gag gcc aac att 624
Ala Gln Cys Pro Arg Asp Ile Lys Phe Ile Asn Gly Glu Ala Asn Ile
195 200 205

gag ggc tgg acc ggc tcc acc aac gac ccc aac gcc ggc gcg ggc cgc 672
Glu Gly Trp Thr Gly Ser Thr Asn Asp Pro Asn Ala Gly Ala Gly Arg
210 215 220

tat ggt acc tgc tgc tct gag atg gat atc tgg gaa gcc aac aac atg 720
Tyr Gly Thr Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Met
225 230 235 240

gct act gcc ttc act cct cac cct tgc acc atc att ggc cag agc cgc 768
Ala Thr Ala Phe Thr Pro His Pro Cys Thr Ile Ile Gly Gln Ser Arg
245 250 255

tgc gag ggc gac tcg tgc ggt ggc acc tac agc aac gag cgc tac gcc 816
Cys Glu Gly Asp Ser Cys Gly Gly Thr Tyr Ser Asn Glu Arg Tyr Ala
260 265 270

ggc gtc tgc gac ccc gat ggc tgc gac ttc aac tcg tac cgc cag ggc 864
Gly Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly
275 280 285

aat aag acc ttc tac ggc aag ggc atg acc gtc gac acc acc aag aag 912
Asn Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys
290 295 300

atc act gtc gtc acc cag ttc ctc aag gat gcc aac ggc gat ctc ggc 960
Ile Thr Val Val Thr Gln Phe Leu Lys Asp Ala Asn Gly Asp Leu Gly
305 310 315 320

gag gtc aag cgc ttc tac gtc cag gat ggc aag atc atc ccc aac tcc 1008
Glu Val Lys Arg Phe Tyr Val Gln Asp Gly Lys Ile Ile Pro Asn Ser
325 330 335

gag tcc acc atc ccc ggc gtc gag ggc aat tcc atc acc cag gac tgg 1056
Glu Ser Thr Ile Pro Gly Val Glu Gly Asn Ser Ile Thr Gln Asp Trp
340 345 350

tgc 9^ cgc cag aag gtt gcc ttt ggc gac att gac gac ttc aac cgc 1104
Cys Asp Arg Gln Lys Val Ala Phe Gly Asp Ile Asp Asp Phe Asn Arg
355 360 365

aag 99^ ggc atg aag cag atg ggc aag gcc ctc gcc ggc ccc atg gtc 1152
Lys Gly Gly Met Lys Gln Met Gly Lys Ala Leu Ala Gly Pro Met Val
370 375 380

ctg gtc atg tcc atc tgg gat gac cac gcc tcc aac atg ctc tgg ctc 1200
Leu Val Met Ser Ile Trp Asp Asp His Ala Ser Asn Met Leu Trp Leu
385 390 395 400

gac tcg acc ttc cct gtc gat gcc gct ggc aag ccc ggc gcc gag cgc 1248
Asp Ser Thr Phe Pro Val Asp Ala Ala Gly Lys Pro Gly Ala Glu Arg
405 410 415

ggt gcc tgc ccg acc acc tcg ggt gtc cct gct gag gtt gag gcc gag 1296
Gly Ala Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu
420 425 430

gcc ccc aac agc aac gtc gtc ttc tcc aac atc cgc ttc ggc ccc atc 1344
Ala Pro Asn Ser Asn Val Val Phe Ser Asn Ile Arg Phe Gly Pro Ile
435 440 445

ggc tcg acc gtt gct ggt ctc ccc ggc gcg ggc aac ggc ggc aac aac 1392
Gly Ser Thr Val Ala Gly Leu Pro Gly Ala Gly Asn Gly Gly Asn Asn
450 455 460

ggc ggc aac ccc ccg ccc ccc acc acc acc acc tcc tcg gct ccg gcc 1440
Gly Gly Asn Pro Pro Pro Pro Thr Thr Thr Thr Ser Ser Ala Pro Ala
465 470 475 480

acc acc acc acc gcc agc gct ggc ccc aag gct ggc cac tgg cag cag 1488
Thr Thr Thr Thr Ala Ser Ala Gly Pro Lys Ala Gly His Trp Gln Gln
485 490 495

tgc ggc ggc atc ggc ttc act ggc ccg acc cag tgc gag gag ccc tac 1536
Cys Gly Gly Ile Gly Phe Thr Gly Pro Thr Gln Cys Glu Glu Pro Tyr
500 505 510

act tgc acc aag ctc aac gac tgg tac tct cag tgc ctg taa 1578
Thr Cys Thr Lys Leu Asn Asp Trp Tyr Ser Gln Cys Leu
515 520 525



<210> 60 
<211> 525 
<212> PRT 

<213> Scytalidium thermophilum 
<400> 60 

Met Arg Thr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Ser Ala
1 5 10 15

Ala Ala Gln Gln Ala Cys Ser Leu Thr Thr Glu Arg His Pro Ser Leu
20 25 30

Ser Trp Lys Lys Cys Thr Ala Gly Gly Gln Cys Gln Thr Val Gln Ala
35 40 45

Ser Ile Thr Leu Asp Ser Asn Trp Arg Trp Thr His Gln Val Ser Gly
50 55 60

Ser Thr Asn Cys Tyr Thr Gly Asn Lys Trp Asp Thr Ser Ile Cys Thr
65 70 75 80

Asp Ala Lys Ser Cys Ala Gln Asn Cys Cys Val Asp Gly Ala Asp Tyr
85 90 95

Thr Ser Thr Tyr Gly Ile Thr Thr Asn Gly Asp Ser Leu Ser Leu Lys
100 105 110

Phe Val Thr Lys Gly Gln His Ser Thr Asn Val Gly Ser Arg Thr Tyr
115 120 125

Leu Met Asp Gly Glu Asp Lys Tyr Gln Thr Phe Glu Leu Leu Gly Asn
130 135 140

Glu Phe Thr Phe Asp Val Asp Val Ser Asn Ile Gly Cys Gly Leu Asn
145 150 155 160

Gly Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Leu Ser Arg
165 170 175

Tyr Pro Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp
180 185 190

Ala Gln Cys Pro Arg Asp Ile Lys Phe Ile Asn Gly Glu Ala Asn Ile
195 200 205

Glu Gly Trp Thr Gly Ser Thr Asn Asp Pro Asn Ala Gly Ala Gly Arg
210 215 220

Tyr Gly Thr Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Asn Met
225 230 235 240

Ala Thr Ala Phe Thr Pro His Pro Cys Thr Ile Ile Gly Gln Ser Arg
245 250 255

Cys Glu Gly Asp Ser Cys Gly Gly Thr Tyr Ser Asn Glu Arg Tyr Ala
260 265 270

Gly Val Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly
275 280 285

Asn Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys
290 295 300

Ile Thr Val Val Thr Gln Phe Leu Lys Asp Ala Asn Gly Asp Leu Gly
305 310 315 320

Glu Val Lys Arg Phe Tyr Val Gln Asp Gly Lys Ile Ile Pro Asn Ser
325 330 335

Glu Ser Thr Ile Pro Gly Val Glu Gly Asn Ser Ile Thr Gln Asp Trp
340 345 350

Cys Asp Arg Gln Lys Val Ala Phe Gly Asp Ile Asp Asp Phe Asn Arg
355 360 365

Lys Gly Gly Met Lys Gln Met Gly Lys Ala Leu Ala Gly Pro Met Val
370 375 380

Leu Val Met Ser Ile Trp Asp Asp His Ala Ser Asn Met Leu Trp Leu
385 390 395 400

Asp Ser Thr Phe Pro Val Asp Ala Ala Gly Lys Pro Gly Ala Glu Arg
405 410 415

Gly Ala Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu
420 425 430

Ala Pro Asn Ser Asn Val Val Phe Ser Asn Ile Arg Phe Gly Pro Ile
435 440 445

Gly Ser Thr Val Ala Gly Leu Pro Gly Ala Gly Asn Gly Gly Asn Asn
450 455 460

Gly Gly Asn Pro Pro Pro Pro Thr Thr Thr Thr Ser Ser Ala Pro Ala
465 470 475 480

Thr Thr Thr Thr Ala Ser Ala Gly Pro Lys Ala Gly His Trp Gln Gln
485 490 495

Cys Gly Gly Ile Gly Phe Thr Gly Pro Thr Gln Cys Glu Glu Pro Tyr
500 505 510

Thr Cys Thr Lys Leu Asn Asp Trp Tyr Ser Gln Cys Leu
515 520 525



<210> 61 

<211> 519 

<212> DNA 

<213> Aspergillus sp. 
<220> 

<221> misc_feature
<222> (1)..(519)

<223> Partial CBH1 encoding sequence 
<400> 61 

gagatggaca tatgggaggc caacagcatc tccacggcct tcacgcccca cccctgcgat 60 

gtccccggcc aggtgatgtg cgagggcgac tcctgcggtg gcacctacag cagcgaccgc 120

tatggcggca cctgcgatcc cgatggatgt gacttcaact cctaccgcca gggcaacaag 180 

tccttctacg gccccggcat gaccgtcgac accaacagca aggtcaccgt cgtgactcag 240 

ttcctcaccg acgacggcac tgccaccggc accctgtcgg agatcaagcg gttctacgtg 300 

cagaacggca aggtcatccc caactccgag tcgacctggc ccggcgtcgg cggcaactcc 360 

atcaccaccg actactgtct ggcccagaag agcctcttcg gcgataccga cgtcttcacc 420 

aagcacggcg gtatggaggg catgggcgcc gccctcgccg agggcatggt cctcgtcctg 480

agtctctggg acgaccacca ctccaacatg ctctggctg 519 

<210> 62 
<211> 497 
<212> DNA 

<213> Scopulariopsis sp. 
<220> 

<221> misc_feature
<222> (1)..(497)

<223> Partial CBH1 encoding sequence 
<400> 62 

gagatcgatg tgtgggagtc gaacgcctat gccttcgttt tcacgccgca cgcgtgcacg 60 
accaacgagt accacgtctg cgagaccacc aactgcggtg gcacctactc ggaggaccgc 120
ttcaccggca agtgcgacgc caacggctgc gactacaacc cctaccgcat gggcaacccc 180
gacttctacg gcaagggcaa gacgctcgac accagccgca agttcaccgt cgtctcccgc 240
ttcgaggaga acaagctctc ccagtacttc atccaggacg gccgcaagat cgagatcccg 300
ccgccgacgt gggagggcat gcccaacagc agcgagatca cccccgagct ctgctccacc 360
atgttcgatg tgctcgacga ccgcaaccgc ttgcaggagg tcggcggctt cgagcagctg 420
aacaacgccc tccgggttcc catggtcctc gtcatgtcca tctgggacga ccactacgcc 480 
aacatgctct ggctcga 497 

<210> 63 

<211> 498 

<212> DNA 

<213> Fusarium sp.



<220> 

<221> misc_feature
<222> (1)..(498)

<223> Partial CBH1 encoding sequence 
<400> 63 

gagatggata tctgggaggc caacaagatc tccactgcct acactcccca cccctgcaag 60

agcctcaccc agcagtcctg cgagggcgat gcctgcggtg gcacctactc tactacccgc 120

tatgctggaa cttgcgaccc cgatggttgc gatttcaacc cttaccgcca gggcaacaag 180

accttctacg gccccggctc cggcttcaac gttgatacca ccaagaaggt gactgtcgtg 240

acccagttca tcaagggcag cgacggcaag ctttccgaga tcaagcgtct ctatgttcag 300

aatggcaagg tcattggcaa cccccagtct gagattgcca gcaaccctgg cagcagcgtc 360

accgacagct tctgcaaggc ccagaaggtt gccttcaacg accccgatga cttcaacaag 420

aagggtggct ggagcggaat gagcgacgcc ctcgccaagc ccatggttct cgtcatgagc 480

ttgtggcacg acgtgagt 498

<210> 64 

<211> 525 

<212> DNA 

<213> Verticillium sp.
<220> 

<221> misc_feature

<222> (1)..(525)

<223> Partial CBH1 encoding sequence 

<400> 64 



gagatggata tctgggaggc caacaagatc tccacggcct acactcccca tccctgcaag 60

agcctcaccc agcagtcctg tgagggcgat gcctgcggtg gcacctactc ttccacccgc 120

tatgctggaa cttgcgatcc cgatggctgc gatttcaacc cttaccgcca gggcaaccac 180

accttctacg gtcccggctc cggcttcaac gtcgatacca ccaagaaggt gactgtcgtg 240

acccagttca tcaagggcag cgacggcaag ctttccgaga tcaagcgtct ctatgttcag 300

aatggcaagg tcatcggcaa cccccagtcc gagattgcaa acaaccccgg cagctccgtc 360

accgacagct tctgcaaggc ccagaaggtt gccttcaacg accccgatga cttcaacaag 420

aagggtggct ggagcggcat gaacgacgcc ctcgccaagc ccatggttct cgtcatgagc 480

ctgtggcacg acgtgagtaa tctaacccct gagtctcgga caaga 525



<210> 65 

<211> 1371 

<212> DNA 

<213> Pseudoplectania nigrella 
<220> 

<221> CDS 
<222> (1)..(1371)
<223> 

<400> 65 

atg cta tcc aat ctc ctt ctc tca ctc tct ttc ctt tcc cta gcc tcc 48
Met Leu Ser Asn Leu Leu Leu Ser Leu Ser Phe Leu Ser Leu Ala Ser
1 5 10 15

ggg caa aac atc ggt acc aac acc gcc gaa agc cac ccc caa ctt cgt 96
Gly Gln Asn Ile Gly Thr Asn Thr Ala Glu Ser His Pro Gln Leu Arg
20 25 30

tct caa acc tgc acc aaa ggc aac gga tgc agc acc caa tcc acc tcc 144
Ser Gln Thr Cys Thr Lys Gly Asn Gly Cys Ser Thr Gln Ser Thr Ser
35 40 45

gta gtc ctg gac tcc aac tgg cgc tgg ctg cac aat aat gga ggt tca 192
Val Val Leu Asp Ser Asn Trp Arg Trp Leu His Asn Asn Gly Gly Ser
50 55 60

acg aac tgc tac acc ggc aat tcc tgg gac tct aca tta tgt ccc gac 240
Thr Asn Cys Tyr Thr Gly Asn Ser Trp Asp Ser Thr Leu Cys Pro Asp
65 70 75 80

cca gtt acc tgc gcc aag aac tgt gct ctc gac ggt gcc gac tat tct 288
Pro Val Thr Cys Ala Lys Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser
85 90 95

ggg aca tac gga atc acc tct acg gga gat gct ttg acg ttg aag ttt 336
Gly Thr Tyr Gly Ile Thr Ser Thr Gly Asp Ala Leu Thr Leu Lys Phe
100 105 110

gtt act cag ggt cct tat tcg act aat att gga tct cgg gta tac cta 384
Val Thr Gln Gly Pro Tyr Ser Thr Asn Ile Gly Ser Arg Val Tyr Leu
115 120 125

atg gcg agt gat act cag tat aag atg ttc cag ctc aag aac aag gag 432
Met Ala Ser Asp Thr Gln Tyr Lys Met Phe Gln Leu Lys Asn Lys Glu
130 135 140

ttt acg ttt gat gtt gat gtc tct aat ctt cct tgt gga tta aac gga 480
Phe Thr Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly
145 150 155 160

gcg ttg tat ttt gtg gag atg gat gcg gat gga gga atg tcg aaa tac 528
Ala Leu Tyr Phe Val Glu Met Asp Ala Asp Gly Gly Met Ser Lys Tyr
165 170 175

ccg tct aat aaa gcc ggg gca aaa tat gga acc ggg tat tgt gat gcg 576
Pro Ser Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala
180 185 190

cag tgt cca cat gat atc aaa ttt atc aac ggg gag gca aat ctc cta 624
Gln Cys Pro His Asp Ile Lys Phe Ile Asn Gly Glu Ala Asn Leu Leu
195 200 205

gac tgg acg cct tca acc agc gac aaa aat gcc ggc tcc gga cgt tac 672
Asp Trp Thr Pro Ser Thr Ser Asp Lys Asn Ala Gly Ser Gly Arg Tyr
210 215 220

ggg acc tgt tgt caa gaa atg gac atc tgg gaa gcc aac agc atg gca 720
Gly Thr Cys Cys Gln Glu Met Asp Ile Trp Glu Ala Asn Ser Met Ala
225 230 235 240

acc gcc tat aca ccg cat ccc tgt agt gtc tca gga cct acc cga tgc 768
Thr Ala Tyr Thr Pro His Pro Cys Ser Val Ser Gly Pro Thr Arg Cys
245 250 255

tca gga acc caa tgt ggg gat ggt tct aac cgt cat aac gga att tgc 816
Ser Gly Thr Gln Cys Gly Asp Gly Ser Asn Arg His Asn Gly Ile Cys
260 265 270

gat aaa gat ggc tgc gat ttc aat tcc tac cgt atg ggc aat acg aca 864
Asp Lys Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asn Thr Thr
275 280 285

ttc ttc ggc aag gga gca acg gtt aac acc aac tcc aaa ttt act gtt 912
Phe Phe Gly Lys Gly Ala Thr Val Asn Thr Asn Ser Lys Phe Thr Val
290 295 300

gta acg caa ttc atc acc tcc gac aac acc tca act gga gcg cta aag 960
Val Thr Gln Phe Ile Thr Ser Asp Asn Thr Ser Thr Gly Ala Leu Lys
305 310 315 320

gag att cgt cgt ctt tat att cag aat gga aaa gtc atc cag aac tcg 1008
Glu Ile Arg Arg Leu Tyr Ile Gln Asn Gly Lys Val Ile Gln Asn Ser
325 330 335

aaa agt aat atc tcc ggc atg tca gct tac gac tct ata acc gag gat 1056
Lys Ser Asn Ile Ser Gly Met Ser Ala Tyr Asp Ser Ile Thr Glu Asp
340 345 350

ttc tgt gcc gct caa aaa acc gca ttt gga gac aca aat gac ttt aag 1104
Phe Cys Ala Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Asp Phe Lys
355 360 365

gca aag ggc gga ttt aca aac ctt ggg aat gcg ttg caa aag gga atg 1152
Ala Lys Gly Gly Phe Thr Asn Leu Gly Asn Ala Leu Gln Lys Gly Met
370 375 380

gtt ttg gcg ttg agt att tgg gat gat cat gct gcg cag atg ctt tgg 1200
Val Leu Ala Leu Ser Ile Trp Asp Asp His Ala Ala Gln Met Leu Trp
385 390 395 400

ttg gat agt tct tac ccg ctc gat aaa gac cct tct caa cca ggt gtt 1248
Leu Asp Ser Ser Tyr Pro Leu Asp Lys Asp Pro Ser Gln Pro Gly Val
405 410 415

aag agg ggc gcg tgt gct acc tct tct ggt aaa ccg tcg gat gtc gag 1296
Lys Arg Gly Ala Cys Ala Thr Ser Ser Gly Lys Pro Ser Asp Val Glu
420 425 430

aac cag tct ccg aat gcg tcg gtg act ttt tcg aac att aag ttt ggg 1344
Asn Gln Ser Pro Asn Ala Ser Val Thr Phe Ser Asn Ile Lys Phe Gly
435 440 445

gat att gga tcg act tat tcc tct tag 1371
Asp Ile Gly Ser Thr Tyr Ser Ser
450 455



<210> 66 
<211> 456 
<212> PRT

<213> Pseudoplectania nigrella 
<400> 66 

Met Leu Ser Asn Leu Leu Leu Ser Leu Ser Phe Leu Ser Leu Ala Ser 
1 5 10 15 

Gly Gin Asn lie Gly Thr Asn Thr Ala Glu Ser His Pro Gin Leu Arg 

20 25 30 

Ser Gin Thr Cys Thr Lys Gly Asn Gly Cys Ser Thr Gin Ser Thr Ser 
35 40 45 

Val Val Leu Asp Ser Asn Trp Arg Trp Leu His Asn Asn Gly Gly Ser 
50 55 60 

Thr Asn Cys Tyr Thr Gly Asn Ser Trp Asp Ser Thr Leu Cys Pro Asp 
65 70 75 80 

Pro Val Thr Cys Ala Lys Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser 

85 90 95 

Gly Thr Tyr Gly He Thr Ser Thr Gly Asp Ala Leu Thr Leu Lys Phe 

100 105 110 

Val Thr Gin Gly Pro Tyr Ser Thr Asn He Gly Ser Arg Val Tyr Leu 
115 120 125 

Met Ala Ser Asp Thr Gin Tyr Lys Met Phe Gin Leu Lys Asn Lys Glu 
130 135 140 

Phe Thr Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly 
145 150 155 160 

Ala Leu Tyr Phe Val Glu Met Asp Ala Asp Gly Gly Met Ser Lys Tyr 

165 170 175 

Pro Ser Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala 

180 185 190 

Gin Cys Pro His Asp He Lys Phe He Asn Gly Glu Ala Asn Leu Leu 
195 200 205 

Asp Trp Thr Pro Ser Thr Ser Asp Lys Asn Ala Gly Ser Gly Arg Tyr 
210 215 220 

Gly Thr Cys Cys Gin Glu Met Asp He Trp Glu Ala Asn Ser Met Ala 
225 230 235 240 

Thr Ala Tyr Thr Pro His Pro Cys Ser Val Ser Gly Pro Thr Arg Cys 

245 250 255 

Ser Gly Thr Gln Cys Gly Asp Gly Ser Asn Arg His Asn Gly Ile Cys

260 265 270 

Asp Lys Asp Gly Cys Asp Phe Asn Ser Tyr Arg Met Gly Asn Thr Thr 
275 280 285 

Phe Phe Gly Lys Gly Ala Thr Val Asn Thr Asn Ser Lys Phe Thr Val 
290 295 300 
Val Thr Gln Phe Ile Thr Ser Asp Asn Thr Ser Thr Gly Ala Leu Lys
305 310 315 320 

Glu Ile Arg Arg Leu Tyr Ile Gln Asn Gly Lys Val Ile Gln Asn Ser

325 330 335 

Lys Ser Asn Ile Ser Gly Met Ser Ala Tyr Asp Ser Ile Thr Glu Asp

340 345 350 

Phe Cys Ala Ala Gln Lys Thr Ala Phe Gly Asp Thr Asn Asp Phe Lys
355 360 365 

Ala Lys Gly Gly Phe Thr Asn Leu Gly Asn Ala Leu Gln Lys Gly Met
370 375 380 

Val Leu Ala Leu Ser Ile Trp Asp Asp His Ala Ala Gln Met Leu Trp
385 390 395 400 

Leu Asp Ser Ser Tyr Pro Leu Asp Lys Asp Pro Ser Gln Pro Gly Val

405 410 415 

Lys Arg Gly Ala Cys Ala Thr Ser Ser Gly Lys Pro Ser Asp Val Glu 

420 425 430 

Asn Gln Ser Pro Asn Ala Ser Val Thr Phe Ser Asn Ile Lys Phe Gly
435 440 445 

Asp Ile Gly Ser Thr Tyr Ser Ser
450 455 



<210> 67 

<211> 951 

<212> DNA 

<213> Phytophthora infestans 
<220> 

<221> misc_feature

<222> (1)..(951)

<223> Partial CBH1 encoding sequence 

<400> 67 



tgcgatgctg atggttgtga cttcaactct taccgccagg gtaacacctc tttctatggt 60

gcaggtctta ccgtgaacac caacaaagtt ttcaccgttg taacccaatt catcaccaac 120

gatggaacag cttcaggtac cttgaaagaa atccgacgat tctatgttca gaatggcgtc 180

gtgattccaa actcgcaatc cacaatcgct ggagttccag gaaattccat caccgactct 240

ttctgtgccg cacaaaagac tgcttttggt gacaccaacg aattcgctac taagggaggt 300

cttgccacaa tgagcaaagc tttggcaaag ggtatggtac ttgtcatgtc catttgggat 360

gaccataccg ccaacatgtt gtggctcgat gccccttacc cagcaaccaa atccccaagc 420

gccccaggtg tcactcgagg atcatgcagt gctacttcag gtaaccccgt tgatgttgaa 480

gccaattctc caggttcttc cgtcaccttc tcaaacatca agtggggtcc catcaactct 540

acctacactg gatctggagc cgccccaagt gttccaggca ctacaaccgt tagctcggca 600

cccgcatcga ctgcaacttc aggagctggt ggtgtcgcta agtatgccca atgtggaggt 660

actggataca gtggagctac cgcttgcgtt tcaggcagca cctgtgttgc cctcaaccct 720 

tactactccc aatgccaata gattgtttcc ctcaggagca attaggtttc caacctaagg 780 

ggagagatct tcacaagtct gtacataggg tcagctaaat gttgatcatt catattcttt 840 

catgtattta gttgttgaca atttgaagtt gcaagtcaag acgggaaaac agaagcagga 900 

aatatatggg acataacaaa gtcaatcgtt tacataagaa ccttctttaa a 951 